Friday, 22 January 2010

Manage C# threads easily using an array of BackgroundWorker objects


I recently built a piece of software that processed files on a server.  There were hundreds of thousands of files, so a single pass took a long time, and I decided to use threads to process multiple files at once.

I wanted to control the number of threads through a single variable, so that I could use more or fewer threads depending on the power of the machine running the process, and benchmark different thread counts to find the optimum.

The solution I came up with uses an array of .NET 2.0 BackgroundWorker objects, sized at run time by the integer variable "maxThreads".

The following code sets up the variables and initialises the array:

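// Namespaces used by the code in this post (place at the top of the file):
using System.ComponentModel;   // BackgroundWorker and its event argument types
using System.Diagnostics;      // Debug.Print
using System.Threading;        // Thread.Sleep
using System.Windows.Forms;    // MessageBox

static int maxThreads = 20;  //Make bigger or smaller, it's up to you!
private BackgroundWorker[] threadArray = new BackgroundWorker[maxThreads];
static int _numberBackGroundThreads;  //Just for fun

// Set up the BackgroundWorker objects by 
// attaching event handlers. 
private void InitializeBackgroundWorkers()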
{
    for (int f = 0; f < maxThreads; f++)
    {
        threadArray[f] = new BackgroundWorker();
        threadArray[f].DoWork +=
            new DoWorkEventHandler(backgroundWorkerFiles_DoWork);
        threadArray[f].RunWorkerCompleted +=
            new RunWorkerCompletedEventHandler(backgroundWorkerFiles_RunWorkerCompleted);
        threadArray[f].ProgressChanged +=
            new ProgressChangedEventHandler(backgroundWorkerFiles_ProgressChanged);
        threadArray[f].WorkerReportsProgress = true;
        threadArray[f].WorkerSupportsCancellation = true;

    }
}


Each BackgroundWorker object has three event handlers assigned to it:
  1. backgroundWorkerFiles_DoWork - this delegate method runs the process on a background thread
  2. backgroundWorkerFiles_RunWorkerCompleted - this delegate method is called once the "DoWork" method has completed
  3. backgroundWorkerFiles_ProgressChanged - this delegate method is used to pass information back to the calling thread, for example to report progress to the GUI thread.
These delegate methods are discussed at length on the MSDN site so I won't go into them in detail here, other than to show you this simple code outline:
private void backgroundWorkerFiles_DoWork(object sender, DoWorkEventArgs e)
{
    //Just for fun - increment the count of the number of threads we are currently using.  Can show this number in the GUI.
    //DoWork runs on several threads at once, so use Interlocked rather than a plain ++.
    Interlocked.Increment(ref _numberBackGroundThreads);
    
    // Get argument from DoWorkEventArgs argument.  Can use any type here with cast
    int myProcessArgument = (int)e.Argument;

    // "ProcessFile" is the name of my method that does the main work.  Replace with your own method!  
    // Can return results from this method, i.e. a status (OK, FAIL etc)
    e.Result = ProcessFile(myProcessArgument);
}

private void backgroundWorkerFiles_ProgressChanged(object sender, ProgressChangedEventArgs e)
{
    // Use this method to report progress to GUI
}

private void backgroundWorkerFiles_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
{
    // First, handle the case where an exception was thrown.
    if (e.Error != null)
    {
        MessageBox.Show(e.Error.Message);
    }
    else
    {
        // For fun - print out the result of the ProcessFile() method.
        // e.Result is only valid when no error occurred; reading it after a failure throws.
        Debug.Print(e.Result.ToString());
    }

    // Just for fun - decrement the count of threads
    Interlocked.Decrement(ref _numberBackGroundThreads);
}
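
To make the progress handler above a little more concrete, here is a minimal sketch of reporting progress from inside DoWork. It is a console-style example rather than my production code: the class name ProgressSketch, the method RunWithProgress and the simulated work are all made up for illustration. In a Windows Forms app you would update a control in the ProgressChanged handler instead of recording a number.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

// Hypothetical sketch: the names and the simulated work are assumptions, not from the post.
public static class ProgressSketch
{
    // Runs one worker that reports progress `steps` times and
    // returns the highest percentage the ProgressChanged handler saw.
    public static int RunWithProgress(int steps)
    {
        int highest = 0;
        object gate = new object();

        // One signal per progress report, plus one for RunWorkerCompleted.
        using (var pending = new CountdownEvent(steps + 1))
        {
            var worker = new BackgroundWorker { WorkerReportsProgress = true };

            worker.DoWork += (sender, e) =>
            {
                for (int i = 1; i <= steps; i++)
                {
                    Thread.Sleep(5);                         // stand-in for real work
                    worker.ReportProgress(i * 100 / steps);  // raises ProgressChanged
                }
                e.Result = "OK";
            };

            worker.ProgressChanged += (sender, e) =>
            {
                // In a GUI app this is where a progress bar would be updated.
                lock (gate)
                {
                    if (e.ProgressPercentage > highest) highest = e.ProgressPercentage;
                }
                pending.Signal();
            };

            worker.RunWorkerCompleted += (sender, e) => pending.Signal();

            worker.RunWorkerAsync();
            pending.Wait();   // block until every event has been handled
        }

        lock (gate) { return highest; }
    }
}
```

Calling ProgressSketch.RunWithProgress(4) reports 25, 50, 75 and 100 percent in turn. Without a GUI thread the handlers may run in any order, which is why the sketch keeps only the highest value.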

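OK, now comes the fun part. The following code loops through a large number of items. Rather than run ProcessFile() for each item in turn, we hand each item to the first unused thread. This lets the loop step on to the next item, which is allocated a free thread in the same way. Note that in a Windows Forms application this loop should not run on the GUI thread: RunWorkerCompleted is raised on the GUI thread, so if that thread is stuck in this loop the workers never stop being busy.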
// Some process with many iterations
for(int f = 0; f < 100000; f++)
{

    //Use the thread array to process each iteration
    //choose the first unused thread.
    bool fileProcessed = false;
    while (!fileProcessed)
    {
        for (int threadNum = 0; threadNum < maxThreads; threadNum++)
        {
            if (!threadArray[threadNum].IsBusy)
            {   // This thread is available
                Debug.Print("Starting thread: " + threadNum);
        
                //Call the "RunWorkerAsync()" method of the thread.  
                //This will call the delegate method "backgroundWorkerFiles_DoWork()" method defined above.  
                //The parameter passed (the loop counter "f") will be available through the delegate's argument "e" through the ".Argument" property.
                threadArray[threadNum].RunWorkerAsync(f);
                fileProcessed = true;
                break;
            }
        }
        //If all threads are being used, sleep awhile before checking again
        if (!fileProcessed)
        {
            Thread.Sleep(50);
        }
    }
}
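
For anyone who wants to try the loop outside a Forms project, here is a stripped-down console sketch of the same idea. It is only an illustration under a few assumptions: WorkerPoolSketch and ProcessAll are names I made up, and ProcessFile here is a stand-in that just sleeps. In a console app there is no GUI thread, so the completion events are raised on thread-pool threads and the polling loop can run on the main thread.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

// Hypothetical console sketch of the dispatch loop; names are assumptions.
public static class WorkerPoolSketch
{
    // Stand-in for the real ProcessFile() method, which is not shown in the post.
    private static string ProcessFile(int item)
    {
        Thread.Sleep(5);   // simulate some work
        return "OK " + item;
    }

    // Processes items 0..itemCount-1 on at most maxThreads workers
    // and returns how many items completed.
    public static int ProcessAll(int itemCount, int maxThreads)
    {
        int completed = 0;
        var workers = new BackgroundWorker[maxThreads];
        for (int i = 0; i < maxThreads; i++)
        {
            workers[i] = new BackgroundWorker();
            workers[i].DoWork += (sender, e) => e.Result = ProcessFile((int)e.Argument);
            workers[i].RunWorkerCompleted += (sender, e) => Interlocked.Increment(ref completed);
        }

        for (int f = 0; f < itemCount; f++)
        {
            bool started = false;
            while (!started)
            {
                // Hand the item to the first worker that is not busy.
                foreach (BackgroundWorker worker in workers)
                {
                    if (!worker.IsBusy)
                    {
                        worker.RunWorkerAsync(f);
                        started = true;
                        break;
                    }
                }
                // All workers busy: wait a little before checking again.
                if (!started) Thread.Sleep(10);
            }
        }

        // Wait until every RunWorkerCompleted handler has run.
        while (Volatile.Read(ref completed) < itemCount) Thread.Sleep(10);
        return completed;
    }
}
```

For example, WorkerPoolSketch.ProcessAll(25, 4) pushes 25 items through 4 workers. Only the main thread ever calls RunWorkerAsync, so a worker seen as not busy cannot be grabbed by anyone else before it is started.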

Using this technique, it's easy to create 2, 10, or even 100 threads to process loop iterations asynchronously, which can speed up execution considerably.  Simply change the value of "maxThreads" to whatever you need!
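
If you want to compare different values of "maxThreads" rather than guess, a small timing harness along these lines may help. This is a rough sketch, not something from my project: ThreadCountBenchmark and Measure are made-up names, and runJob is a hypothetical callback that you would write to run your whole job with the given number of threads.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical timing harness; the names are assumptions for illustration.
public static class ThreadCountBenchmark
{
    // Runs runJob once per candidate thread count and returns the
    // elapsed milliseconds recorded for each count.
    public static Dictionary<int, long> Measure(IEnumerable<int> candidates, Action<int> runJob)
    {
        var timings = new Dictionary<int, long>();
        foreach (int maxThreads in candidates)
        {
            Stopwatch watch = Stopwatch.StartNew();
            runJob(maxThreads);   // caller-supplied: process everything with this many threads
            watch.Stop();
            timings[maxThreads] = watch.ElapsedMilliseconds;
        }
        return timings;
    }
}
```

You might call it with candidates such as 2, 10, 20 and 100 and print the resulting dictionary. A single run per value is noisy, so treat the numbers as a rough guide and repeat the measurement if the results are close.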

16 comments:

  1. Hi, this is just to thank you for the insight.

    I have to go through an XML index of files to download and this is just the thing I was looking for to speed up the process :D

  2. I've tried and tried, but keep getting this error: "This BackgroundWorker is currently busy and cannot run multiple tasks concurrently."

    I've copied and pasted your code and am passing a string to my method. The only thing I've changed is that I took your for(int f = 0; f < 100000; f++) loop out of the method.

    Can you please help?

  3. Hi DJ - it sounds like you are somehow trying to re-use a busy thread. Have you tried stepping through your code in debug mode?

    This section is the logic that should ensure that you *never* reuse a busy thread, so check that your code is doing something similar:

    bool fileProcessed = false;
    while (!fileProcessed)
    {
        for (int threadNum = 0; threadNum < maxThreads; threadNum++)
        {
            if (!threadArray[threadNum].IsBusy)
            {   // This thread is available
                Debug.Print("Starting thread: " + threadNum);

                //Call the "RunWorkerAsync()" method of the thread.
                //This will call the delegate method "backgroundWorkerFiles_DoWork()" method defined above.
                //The parameter passed (the loop counter "f") will be available through the delegate's argument "e" through the ".Argument" property.
                threadArray[threadNum].RunWorkerAsync(f);
                fileProcessed = true;
                break;
            }
        }
        //If all threads are being used, sleep awhile before checking again
        if (!fileProcessed)
        {
            Thread.Sleep(50);
        }
    }

  4. How would I use a progress bar in each thread with this code? I can't make it work. Another question: do I need to declare a BackgroundWorker named "backgroundWorkerFiles" first? I'm new at this, sorry.

  5. You use this method to write to the GUI, so you put your progress bar code in here:

    private void backgroundWorkerFiles_ProgressChanged(object sender, ProgressChangedEventArgs e)
    {
        // Use this method to report progress to GUI
    }

    My example above allows you to have as many threads as you have dimensions in the BackgroundWorker array - you don't need to declare anything else:)


    static int maxThreads = 20; //Make bigger or smaller, it's up to you!
    private BackgroundWorker[] threadArray = new BackgroundWorker[maxThreads];


    My example doesn't show you how to use the background worker, it assumes you already know a little about it, so I suggest you look at some other simple examples on the web and get a single background worker with a progress bar working, then come back to this and you can get hundreds of them working at once :)

  6. thank you for the code and effort,

    i think there is an error in the declaration @
    static int _numberBackGroundThreads --; //Just for fun

    can you use -- in the declaration?!

    Replies
    1. You are quite right that was a typo. I've now updated it :)

  7. Thanks for this post. I was wondering if you experience a delay in connecting to the server when using this method to process files. I've modified the code for my own needs, queuing up HTTP requests to a server to do work, and when I check Fiddler, the HTTP requests don't actually appear for a short while, and then they all appear at once as a massive group. I was under the impression that as soon as I tell a background worker to Run Async, it would go ahead and do its job and so the HTTP requests would be fired off ASAP. Am I wrong?

    Replies
    1. I've not experienced this myself. Are you running it in debug mode? I always find the threads act peculiarly in that mode, probably for good reason. If so try compiling it fully to an .exe and running it and see what happens in Fiddler. Fiddler is a lifesaver isn't it.

      You can find more information about debugging multi-threaded applications here:
      http://msdn.microsoft.com/en-us/library/ms164746.aspx

  8. I'm trying to get this to work and it works perfectly when you have fewer files than maxThreads, but as soon as you have more, the IsBusy flag never gets resolved... it's constantly busy, even though the threads are finishing...

    for (int f = 0; f < files.Count(); f++)
    {

        //Use the thread array to process each iteration
        //choose the first unused thread.
        bool fileProcessed = false;
        while (!fileProcessed)
        {
            for (int threadNum = 0; threadNum < maxThreads; threadNum++)
            {
                if (!threadArray[threadNum].IsBusy)
                {   // This thread is available
                    Debug.Print("Starting thread: " + threadNum);

                    var command = new commandLine();
                    options = replaceParameters(converterOptionsSetBox.Text, files[f], outFiletpeBox.Text, targetPath);
                    command.Path = pathList[0];
                    command.Command = (pathList[1] + " " + options);
                    command.ThreadIndex = threadNum;
                    //Call the "RunWorkerAsync()" method of the thread.
                    //This will call the delegate method "backgroundWorkerFiles_DoWork()" method defined above.
                    //The parameter passed (the loop counter "f") will be available through the delegate's argument "e" through the ".Argument" property.
                    threadArray[threadNum].RunWorkerAsync(command);
                    fileProcessed = true;
                    break;
                }
            }
            //If all threads are being used, sleep awhile before checking again
            if (!fileProcessed)
            {
                Thread.Sleep(50);
            }
        }
    }

    command is an object that has multiple members so that I can pass in a couple of different data types...
    and I tried keeping track of the index for the thread and setting isBusy to false, when the command finishes, but nothing doing...it starts running, and then :

    Starting thread: 0
    Starting thread: 1
    Starting thread: 2
    Starting thread: 3
    Starting thread: 4
    Starting thread: 5
    Starting thread: 6
    Starting thread: 7
    Starting thread: 8
    Starting thread: 9
    The thread '' (0x2d28) has exited with code 0 (0x0).
    The thread '' (0x25d8) has exited with code 0 (0x0).
    The thread '' (0xf0c) has exited with code 0 (0x0).
    The thread '' (0x17c0) has exited with code 0 (0x0).
    The thread '' (0x2964) has exited with code 0 (0x0).
    The thread '' (0x2438) has exited with code 0 (0x0).
    The thread '' (0x374) has exited with code 0 (0x0).
    The thread '' (0xaa8) has exited with code 0 (0x0).

    Yet IsBusy, and IsRunning are still True, which causes the endless loop of while !fileProcessed mainly because it somehow never gets to
    backgroundWorkerFiles_RunWorkerCompleted
    I have a break point in there, and it never fires... any idea of where / how I can release the thread once it's done ?

    Replies
    1. read the first answer in the following thread
      http://stackoverflow.com/questions/2183520/backgroundworkers-never-stop-being-busy

  9. Working through it more, I think my main issue is that Forms are STP (Single Threaded) and therefore this thread hangs in the while loop, never looking at what happened in the event handlers. I tried setting up a master worker thread that initiates the children, so that if it hangs it's no big deal because the children are already aware of the event handlers. But that gave me another exception to figure out.

    Replies
    1. Hi Derek, I have had a similar problem, which turned out to be down to the logic of my code trying to process nodes of a tree structure and getting "stuck" in child nodes that couldn't finish because the parent node had already finished! It took me quite a while to figure out. I think setting up the master thread will help you in your scenario, so best of luck.

  10. Excellent information, still works great 5 years later.

    Replies
    1. Nice to hear! I must write some new posts but never seem to have the time, so hearing that this stuff helps people really encourages me to get on with it :)
