Simulating Faulty Windows Services

A friend of mine wants to simulate a faulty Windows Service, so today we'll talk about something other than PowerShell... okay maybe we'll talk a little bit about PowerShell as well ;-)

Windows Services

Anyone who has spent more than 5 minutes managing one or more instances of Windows Server knows about services - those long-running processes that servers run in the background.

The idea is very simple - you wrap your code in an executable that exposes a few functions that the Service Control Manager (SCM) in Windows can hook into in order to control and instrument the execution of your code.

This architecture also allows SCM to indicate a pending shutdown or stop signal to a service process so that it can free up external resources and otherwise clean up after itself.

When you issue Stop-Service svcname, sc.exe stop svcname or open services.msc or the task manager and right-click -> "Stop":

"Stop" from services.msc

... the SCM sends a STOP signal to the associated process. Great stuff!

Services that just won't stop...

If you spend another 5 minutes managing an application server running some specialized LOB service application (read: any fragile legacy shitware written by the CEO's nephew during a summer internship), you'll find that some Windows Service implementations are not that good at honouring a STOP signal from SCM - for any one of multiple reasons:

Cleanup just takes a long time
Resource contention*
The author of the software is a lazy and/or wise guy

This means that when you attempt to stop the service for one reason or another, you start seeing stuff like:

*) I've seen this happen multiple times with service processes unable to release TCP sockets acquired by a child process due to the way TCP is handled in Windows, but it could be any resource - waiting for locks on shared log files, for example.

Abort, abort!

I have a friend in the Hosting industry who, among other things, is responsible for authoring tools that assist his colleagues in OS and application patching.

In certain situations, they need to shut down services associated with the kind of "line-of-business application" I alluded to above, before manually patching the underlying application.

Without further commenting on the practice of patching-by-hand or writing shitty "enterprise-grade software" that can't shut down properly, this is what he asked me:

I've written a script that stops services in an orderly fashion, and then kills the associated process after a timeout - how can I test this functionality?

So, my friend is interested in being able to test his scripts without touching any production boxes (good for him), and without having to install the faulty application in his test environment.
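His actual script isn't reproduced in this post, but to make the scenario concrete, a stop-then-kill routine might look roughly like the sketch below. The function name Stop-ServiceOrKill, its parameters and the 30-second default are made up for illustration. It also assumes the service runs in its own process; killing a shared host process would take other services down with it.

```powershell
# Hypothetical helper (not my friend's real script): ask a service to stop,
# then kill its process if it has not stopped within $TimeoutSeconds.
function Stop-ServiceOrKill {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string]$Name,

        [int]$TimeoutSeconds = 30   # assumed default, pick your own
    )

    $service = Get-Service -Name $Name -ErrorAction Stop
    if ($service.Status -eq 'Stopped') { return }

    # Record the PID before stopping; Win32_Service exposes it as ProcessId.
    $processId = (Get-CimInstance -ClassName Win32_Service -Filter "Name='$Name'").ProcessId

    # Send the stop control. Stop() does not wait for the Stopped state,
    # so we poll for it ourselves below.
    $service.Stop()

    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ((Get-Date) -lt $deadline) {
        $service.Refresh()
        if ($service.Status -eq 'Stopped') { return }
        Start-Sleep -Milliseconds 500
    }

    # Still not stopped after the timeout: kill the process we recorded.
    # A ProcessId of 0 means there was no process to kill.
    if ($processId) {
        Write-Warning "Service '$Name' did not stop within $TimeoutSeconds seconds, killing PID $processId"
        Stop-Process -Id $processId -Force
    }
}
```

Run from an elevated prompt, Stop-ServiceOrKill -Name SomeService -TimeoutSeconds 60 would give the service a minute to behave before pulling the plug. What's missing is a service that reliably refuses to stop in time, so the kill branch actually gets exercised.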

So, how would one go about that? Well, we could write some code that simulates the service in question of course!

Windows Services and .NET

Remember that stuff about the Service Control Manager hooking into pre-defined functions at the top? The .NET framework comes with a class called ServiceBase that allows us to easily have a .NET application expose these exact functions.

Visual Studio even comes with a project template that does all the basic wiring for us, let's check it out!

Step #1 - Create a "Windows Service" project

Download, install and open any edition of Visual Studio!
Create a new project
In the New Project window, search for "Windows Service".
Choose a target version of the .NET framework supported on your servers
Select the "Windows Service" template and hit OK

(Screenshot above is from Visual Studio Professional 2015, but any version and/or edition will do.)

Step #2 - Write some code!

Now that we have our Windows Service project up and running, it's time to write some code!

The ServiceBase class that we're extending defines a method that executes as soon as the process receives a STOP signal from SCM - the OnStop() method.

All we need to do is override that method, and we can start raising all kinds of havoc. When using the "Windows Service" template, Visual Studio will have already done this for us, and all we need to supply is the method body:

protected override void OnStop()
{
    // This is where our shenanigans will originate
}

... okay, maybe Visual Studio didn't generate that comment, but that's all I see when I look at it.

The simplest way I can think of to simulate a service process that just won't listen is by implementing a version of my teenage self:

protected override void OnStop()
{
    // What, me? Let me just snooze another 5 minutes...
    System.Threading.Thread.Sleep(300 * 1000);
}

This single statement will cause the executing thread to sleep for 5 minutes (300 * 1000 milliseconds), which should be more than enough for my friend to start ripping out his hair when in the middle of a short maintenance window, muhahahahaha!

Step #3 - Compile and install!

Now all we need is to compile the final executable and install it as a service on our test machine!

To do that, hit Ctrl + Shift + B (or go to BUILD -> Build Solution in the top menu). You can find the path to the executable in the Output window. The output path defaults to ${projectPath}\bin\${configProfile}\${projectName}.exe

To install and run our new service application, we can use the New-Service cmdlet in PowerShell from an elevated prompt (here targeting the debug version, with $vsProjectPath holding the path to the project directory):

New-Service -Name "FaultyService" -BinaryPathName (Join-Path $vsProjectPath "bin\Debug\FaultyService.exe") -DisplayName "Your Favourite Faulty Service" -Description "This dummy service sleeps whenever inconvenient"

Conclusion

Now that FaultyService is installed, all we need to do is start our new service:

Start-Service FaultyService

... and now my friend can test the timing bits and functionality of his new scripts with an actual running service without touching the real LOB application. Yatta!
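To watch the misbehaviour for yourself, something like the following sketch can help. It sends the stop request and then prints the service status once a second. Watch-ServiceStop is a name I made up for this post, and it assumes the service is currently running.

```powershell
# Hypothetical helper: request a stop and print the service status every
# second, to see how long the service lingers before reporting Stopped.
function Watch-ServiceStop {
    param(
        [Parameter(Mandatory)]
        [string]$Name,

        [int]$Seconds = 10   # how long to keep watching (assumed default)
    )

    $service = Get-Service -Name $Name -ErrorAction Stop

    # Send the stop control; this does not wait for the Stopped state.
    $service.Stop()

    foreach ($i in 1..$Seconds) {
        Start-Sleep -Seconds 1
        $service.Refresh()   # re-read the current status from the SCM
        '{0:HH:mm:ss}  {1}' -f (Get-Date), $service.Status
    }
}
```

Calling Watch-ServiceStop -Name FaultyService -Seconds 15 from an elevated prompt should make it fairly obvious that the service is in no hurry to go anywhere.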

To uninstall the service, you'll need to call out to sc.exe (the SCM CLI), as Windows PowerShell 5.1 and earlier have no Remove-Service cmdlet (one was added in PowerShell 6.0):

sc.exe delete FaultyService
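Newer PowerShell releases (6.0 and later) do ship a Remove-Service cmdlet, so a cleanup script that has to run on both old and new versions could prefer the cmdlet and fall back to sc.exe. A minimal sketch, with Remove-TestService being a name invented for this example:

```powershell
# Hypothetical helper: remove a service with Remove-Service where the cmdlet
# exists (PowerShell 6.0+), otherwise fall back to sc.exe.
function Remove-TestService {
    param(
        [Parameter(Mandatory)]
        [string]$Name
    )

    if (Get-Command -Name Remove-Service -ErrorAction SilentlyContinue) {
        Remove-Service -Name $Name
    }
    else {
        # Spell out sc.exe: in Windows PowerShell, plain "sc" is an alias
        # for Set-Content.
        sc.exe delete $Name
    }
}
```

Remove-TestService -Name FaultyService then works the same way regardless of which PowerShell version happens to be on the test box.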

I hope this showed how ridiculously easy writing a basic Windows Service application simulant can be - and I hope you'll extend it to your own needs!

For this purpose, I've published a cleaned-up (but still very rudimentary) version of this dummy service on GitHub. It sleeps for 90 seconds by default, but takes the timeout in milliseconds as a process argument. Feel free to contribute by raising an issue or submitting a PR!
