PowerShell scripting has never been my favourite area to work in. Coming from a background of C# and C++, I’ve always found PowerShell to be a bit hacky, not very rigorous and quite time-consuming to write and test. Recently, I needed to multithread some long-running PowerShell scripts, and the results I got, as well as the processes and frameworks I used to achieve them, have completely changed my opinion of PowerShell.
I was able to get a process that previously took up to one hour to complete in less than 2 minutes. What I found was a system that is intuitive, robust and scales incredibly well. I never thought I’d see these kinds of results using PowerShell, but I have been very happily surprised.
This article and its example source code were originally published on CodeProject.
I want to present a series of examples to demonstrate the main features of multithreaded PowerShell. Some experience of PowerShell, as well as an understanding of multithreaded programming in general, is assumed. All scripts were developed using PowerShell 3. I have not tested them in any other version.
When I first started out on this work, I came up with four possible approaches to multithreading my PowerShell scripts.
PowerShell Background Jobs
From MSDN – Cmdlets can perform their action internally or as a Windows PowerShell background job. When a cmdlet runs as a background job, the work is done asynchronously in its own thread separate from the pipeline thread that the cmdlet is using. From the user perspective, when a cmdlet runs as a background job, the command prompt returns immediately even if the job takes an extended amount of time to complete, and the user can continue without interruption while the job runs.
PowerShell jobs are quite a high-level construct. As such, there is limited control at the low level and limited ability to manage multiple threads and have them share variables, etc.
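To make the trade-off concrete, here is a minimal sketch of the background job approach. The workload (a short sleep) and the variable names are placeholders of mine, not part of the original scripts. Note that Start-Job runs each script block in a separate PowerShell process, which is part of why sharing variables between jobs is awkward.

```powershell
# Start three background jobs; the calling script continues immediately.
$jobs = 1..3 | ForEach-Object {
    Start-Job -ArgumentList $_ -ScriptBlock {
        param($id)
        Start-Sleep -Milliseconds 200   # placeholder for long-running work
        "job $id finished"
    }
}

# Block until every job has completed, collect the output, then clean up.
$jobOutput = $jobs | Wait-Job | Receive-Job
$jobs | Remove-Job

$jobOutput
```

Everything a job returns comes back through Receive-Job as serialized output, so there is no direct way for two jobs to work on the same live object.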
PowerShell Workflows
PowerShell workflows are a new concept in PowerShell based on the Windows Workflow Foundation engine. They support parallel processing out of the box. Again, like jobs, they are quite high level and the amount of control they give us is limited.
Runspaces
.NET provides the System.Management.Automation.Runspaces namespace that gives us access to a set of classes designed to create, manipulate and orchestrate a pool of PowerShell processes. This forms the basis of multithreading your PowerShell scripts.
.NET Task Parallel Library
I had the idea to try to directly leverage the TPL from within PowerShell and effectively tackle the problem in exactly the same way as one would if writing multithreaded code in .NET, e.g., instantiating Task objects, etc.
Background jobs and workflows didn’t provide me with enough control, so I quickly dismissed them. My preference was to use the TPL, but I quickly found that things didn’t quite work. Although we can write .NET code directly from within PowerShell, that doesn’t mean we should try to follow the same patterns in both. The two are markedly different, and at the thread level I found that trying to instantiate and manipulate threads from within a PowerShell script was a recipe for disaster. That left me using the System.Management.Automation.Runspaces namespace, and the results were quite pleasing.
A First Simple Example
This is our first demonstration of a multithreaded PowerShell script. We create 50 local text files by downloading a file from the web. We do it first sequentially and then in parallel and compare the results.
The sequential code should be self-explanatory. When executing the process in parallel, the first step is to create a RunspacePool, which hosts one or more Runspaces. A Runspace is an independent operating environment in which a PowerShell process can run. For the purposes of our example, we can think of it as a thread. The RunspacePool allows many PowerShell processes to run concurrently and acts like our thread pool. We create an instance of the PowerShell class, define what command(s) it will run and then execute it asynchronously using BeginInvoke.
We also introduce the idea of SessionState which allows us to share variables and more across all our Runspaces in the RunspacePool.
Try tweaking the value of $numThreads to see how using more or fewer threads affects performance on your system.
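The parallel half of this example can be sketched roughly as follows. This is not the original listing. To keep it runnable offline, the web download is replaced by a short sleep followed by a local file write, and the file count is reduced to 10. The names $numFiles, $outDir and $work are my own placeholders.

```powershell
$numThreads = 5     # maximum number of runspaces running at once
$numFiles   = 10    # placeholder workload size (the article uses 50)
$outDir     = Join-Path ([System.IO.Path]::GetTempPath()) 'runspace-demo'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null

# Variables added to the initial session state are visible in every runspace.
$sessionState = [System.Management.Automation.Runspaces.InitialSessionState]::CreateDefault()
$entry = New-Object System.Management.Automation.Runspaces.SessionStateVariableEntry -ArgumentList 'outDir', $outDir, 'Output folder'
$sessionState.Variables.Add($entry)

$pool = [runspacefactory]::CreateRunspacePool(1, $numThreads, $sessionState, $Host)
$pool.Open()

$work = {
    param($index)
    Start-Sleep -Milliseconds 200   # stand-in for a slow download
    Set-Content -Path (Join-Path $outDir "file$index.txt") -Value "content $index"
}

$timer = [System.Diagnostics.Stopwatch]::StartNew()

# Queue every unit of work; BeginInvoke returns at once with a handle.
$running = foreach ($i in 1..$numFiles) {
    $ps = [powershell]::Create().AddScript($work).AddArgument($i)
    $ps.RunspacePool = $pool
    New-Object psobject -Property @{ Shell = $ps; Handle = $ps.BeginInvoke() }
}

# EndInvoke blocks until the matching invocation has finished.
foreach ($item in $running) {
    $item.Shell.EndInvoke($item.Handle)
    $item.Shell.Dispose()
}

$timer.Stop()
$pool.Close()
$pool.Dispose()

$fileCount = @(Get-ChildItem -Path $outDir -Filter 'file*.txt').Count
"Created $fileCount files in $($timer.ElapsedMilliseconds) ms"
```

With the pool capped at $numThreads runspaces, the queued invocations simply wait their turn, so the script never needs to throttle the loop itself.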
Locking
Locking is a standard technique in multithreaded programming that ensures only one thread at a time can access a shared resource (like a variable). This ensures consistent results and prevents threads from interfering with each other.
In this example, we spin up 100 background processes that all try to update the same text file. Without any locking, many threads will fail because they cannot obtain exclusive access to the file. With locking, we ensure only one thread can update the file at a time.
I am using the LockObject PowerShell module by David Wyatt here to implement the locking (you will have to download the lock object script and add it to the same folder location as this script).
If you run this, you should see some exceptions being thrown from the first run, and you will be left with a test file that does not contain all 100 entries that it should. With locking, there should be no errors and the file will contain all 100 entries.
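For readers who want to try the locked variant without downloading anything, here is a sketch of the same idea. It swaps in a plain System.Threading.Mutex for the LockObject module, so it illustrates the technique and does not reproduce the original script. The file name and variable names are placeholders.

```powershell
$logFile = Join-Path ([System.IO.Path]::GetTempPath()) 'lock-demo.txt'
New-Item -ItemType File -Path $logFile -Force | Out-Null   # start from an empty file
$mutex = New-Object System.Threading.Mutex                  # swapped in for the LockObject module

# Share the file path and the mutex with every runspace in the pool.
$sessionState = [System.Management.Automation.Runspaces.InitialSessionState]::CreateDefault()
$sessionState.Variables.Add((New-Object System.Management.Automation.Runspaces.SessionStateVariableEntry -ArgumentList 'logFile', $logFile, 'Shared file'))
$sessionState.Variables.Add((New-Object System.Management.Automation.Runspaces.SessionStateVariableEntry -ArgumentList 'mutex', $mutex, 'Shared lock'))

$work = {
    param($index)
    $mutex.WaitOne() | Out-Null      # block until this thread owns the lock
    try {
        Add-Content -Path $logFile -Value "entry $index"
    }
    finally {
        $mutex.ReleaseMutex()        # always release, even if the write fails
    }
}

$pool = [runspacefactory]::CreateRunspacePool(1, 10, $sessionState, $Host)
$pool.Open()

$running = foreach ($i in 1..100) {
    $ps = [powershell]::Create().AddScript($work).AddArgument($i)
    $ps.RunspacePool = $pool
    New-Object psobject -Property @{ Shell = $ps; Handle = $ps.BeginInvoke() }
}
foreach ($item in $running) {
    $item.Shell.EndInvoke($item.Handle)
    $item.Shell.Dispose()
}

$pool.Close()
$pool.Dispose()
$mutex.Dispose()

$lineCount = @(Get-Content -Path $logFile).Count
"The file contains $lineCount entries"
```

Removing the WaitOne and ReleaseMutex calls gives the unlocked variant, where concurrent Add-Content calls compete for the file.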
Sharing Variables Across Threads
In this example, we instantiate an array and add it to SessionState, then we spin up two threads. The first takes the array and adds a random letter to it every second. While that process is running, we create another thread which outputs the values in the array every 1.5 seconds. We can see the second thread outputs values that were added to the array in the first thread, which shows that the array is indeed shared across both threads.
Note: I have taken no measures to ensure the process is thread safe here.
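A compressed sketch of this experiment is shown below. It differs from the description above in two labelled ways. The intervals are shortened to 100 ms and 150 ms so it finishes quickly, and the list is wrapped with ArrayList.Synchronized so the reader can take a safe snapshot while the writer is adding. The variable names are my own.

```powershell
# A mutable list shared by reference (a plain @() array would be copied on +=).
$shared = [System.Collections.ArrayList]::Synchronized((New-Object System.Collections.ArrayList))

$sessionState = [System.Management.Automation.Runspaces.InitialSessionState]::CreateDefault()
$sessionState.Variables.Add((New-Object System.Management.Automation.Runspaces.SessionStateVariableEntry -ArgumentList 'shared', $shared, 'List shared by both threads'))

$writer = {
    foreach ($n in 1..5) {
        $letter = [char](Get-Random -Minimum 65 -Maximum 91)   # random letter A-Z
        $shared.Add($letter) | Out-Null
        Start-Sleep -Milliseconds 100
    }
}

$reader = {
    foreach ($n in 1..4) {
        Start-Sleep -Milliseconds 150
        'reader sees: ' + (-join $shared.ToArray())             # snapshot of the list
    }
}

$pool = [runspacefactory]::CreateRunspacePool(2, 2, $sessionState, $Host)
$pool.Open()

# Start the writer first, then the reader; both run concurrently.
$shells = foreach ($block in $writer, $reader) {
    $ps = [powershell]::Create().AddScript($block)
    $ps.RunspacePool = $pool
    New-Object psobject -Property @{ Shell = $ps; Handle = $ps.BeginInvoke() }
}

$readerOutput = $shells[1].Shell.EndInvoke($shells[1].Handle)
$shells[0].Shell.EndInvoke($shells[0].Handle) | Out-Null
$shells | ForEach-Object { $_.Shell.Dispose() }
$pool.Close()
$pool.Dispose()

$readerOutput
'final contents: ' + (-join $shared.ToArray())
```

The letters are random and the exact interleaving depends on timing, so the reader's lines will differ from run to run. The calling script can also read $shared afterwards, because all three hold a reference to the same list.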
Return Data From a Background Thread
In this example, we will spin up 5 background threads that each do the simple job of concatenating two strings. Each background thread returns a custom object containing the original strings and the concatenated result. By calling EndInvoke, these custom objects are then returned to the calling script.
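The shape of that script is roughly as follows. The input strings and the property names First, Second and Result are placeholders of mine and not necessarily those used in the original listing.

```powershell
$pool = [runspacefactory]::CreateRunspacePool(1, 5)
$pool.Open()

$work = {
    param($first, $second)
    # Whatever the script block emits is handed back by EndInvoke.
    New-Object psobject -Property @{
        First  = $first
        Second = $second
        Result = $first + $second
    }
}

$running = foreach ($i in 1..5) {
    $ps = [powershell]::Create().AddScript($work).AddArgument("left$i").AddArgument("right$i")
    $ps.RunspacePool = $pool
    New-Object psobject -Property @{ Shell = $ps; Handle = $ps.BeginInvoke() }
}

# Collect the custom objects returned by each background thread.
$results = foreach ($item in $running) {
    $item.Shell.EndInvoke($item.Handle)
    $item.Shell.Dispose()
}

$pool.Close()
$pool.Dispose()

$results | Sort-Object Result | Format-Table First, Second, Result -AutoSize
```

Because EndInvoke is called in the order the work was queued, the results here come back in queue order even if the threads finish in a different order.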
I hope that gave you some ideas about how you can multithread some of your existing PowerShell processes. Any feedback is welcome, and if the tip helped you, please leave a comment.