Easy Parallelization and Smooth Multitasking - When Parallelization Matters
In one of my recent DevSource articles, I described how calling delegates asynchronously can be the best way to pass typed parameters to a thread and to get back a typed result. In this article, I explore using asynchronous delegates to take advantage of multiple processors.
As long as processing each item has no order-dependent side effects, it is quite easy to take a simple iterative loop and parallelize it: instead of processing each item synchronously, the foreach body makes an asynchronous BeginInvoke call that processes the item on its own thread. It is not much harder to use a Semaphore to match the number of active threads to the number of processors.
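The pattern just described might be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the names ParallelLoopSketch, Worker, Square and ProcessAll are hypothetical, and Square merely stands in for whatever the loop body does. It assumes the .NET Framework, because delegate BeginInvoke and EndInvoke are not supported on .NET Core and later. An indexed for loop replaces foreach only so that each result lands in its item's slot.

```csharp
using System;
using System.Threading;

static class ParallelLoopSketch
{
    // Hypothetical delegate type for the per-item work.
    delegate int Worker(int item);

    // Hypothetical loop body; any side-effect-free computation would do.
    static int Square(int n) { return n * n; }

    public static int[] ProcessAll(int[] items)
    {
        // Assumption: one active call per processor is the desired limit.
        int slots = Environment.ProcessorCount;
        IAsyncResult[] pending = new IAsyncResult[items.Length];
        int[] results = new int[items.Length];

        using (Semaphore gate = new Semaphore(slots, slots))
        {
            // The wrapper releases its slot when the work finishes,
            // even if the work throws.
            Worker worker = delegate(int item)
            {
                try { return Square(item); }
                finally { gate.Release(); }
            };

            for (int i = 0; i < items.Length; i++)
            {
                gate.WaitOne();  // block while 'slots' calls are already active
                pending[i] = worker.BeginInvoke(items[i], null, null);
            }

            // EndInvoke blocks until the matching call has completed
            // and returns its typed result.
            for (int i = 0; i < items.Length; i++)
                results[i] = worker.EndInvoke(pending[i]);
        }
        return results;
    }
}
```

Releasing the Semaphore inside the wrapped delegate, rather than in a completion callback, means every Release has happened before the last EndInvoke returns, so the Semaphore is safe to dispose at that point.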
In the asynchronous delegates article, I used the example of running blocking code in a thread to reduce total run time. For example, you might know a filename some lines before you actually need its contents. You can calculate that filename as soon as possible, and immediately launch a thread to read its contents. You still have to wait milliseconds for the head to move and for the disk to spin, but the original thread may get enough other work done before it has to block and wait for the file contents that its overall run time is lower.
Somewhat similarly, you may need to download a set of files from multiple servers. Obviously, some files download faster than others. If you (synchronously) download the first file, then the second file, and so on to the last file, your total run time is the sum of all the download times, plus some overhead. If you use asynchronous delegate calls to spawn a thread per file, your overhead is higher, because you're making OS calls for synchronization and using the ThreadPool is not free (even though it is cheaper than creating a new Thread), but your run time may be cut to that higher overhead plus the single slowest download.
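To make the sum-versus-slowest comparison concrete, here is a rough timing sketch. It makes several assumptions: FakeDownload is a hypothetical stand-in that simply sleeps for the given number of milliseconds instead of touching the network, the other names are likewise invented for illustration, and the code targets the .NET Framework, where delegate BeginInvoke is available.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class DownloadTimingSketch
{
    // Hypothetical delegate type for one blocking download.
    delegate void Download(int milliseconds);

    // Hypothetical stand-in for a blocking download: it only sleeps.
    static void FakeDownload(int milliseconds) { Thread.Sleep(milliseconds); }

    // One download after another: roughly the sum of the delays.
    public static long Sequential(int[] delays)
    {
        Stopwatch clock = Stopwatch.StartNew();
        foreach (int delay in delays)
            FakeDownload(delay);
        return clock.ElapsedMilliseconds;
    }

    // One asynchronous delegate call per download: roughly the slowest
    // delay plus overhead, provided the pool runs them all at once.
    public static long Parallel(int[] delays)
    {
        Download download = FakeDownload;
        IAsyncResult[] calls = new IAsyncResult[delays.Length];
        Stopwatch clock = Stopwatch.StartNew();
        for (int i = 0; i < delays.Length; i++)
            calls[i] = download.BeginInvoke(delays[i], null, null);
        foreach (IAsyncResult call in calls)
            download.EndInvoke(call);  // wait for each call to finish
        return clock.ElapsedMilliseconds;
    }
}
```

With delays of 50, 100 and 150 milliseconds, Sequential should report roughly 300 ms, while Parallel should come in nearer 150 ms on a machine whose pool runs all three calls at once. Exact figures vary with the machine and with how quickly the pool hands out threads.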
The ThreadPool will not create an unlimited number of threads, so you'll get the maximum speedup if the number of files you need to download (and hence the number of threads you spawn) does not exceed the system ThreadPool's thread limit. If you try to use the ThreadPool to run a delegate asynchronously when all of the pooled threads are already executing a delegate, your request is queued, and it does not actually execute until a delegate returns and thus returns a thread to the pool.
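One way to check in advance whether a batch of requests is likely to be queued is to ask the ThreadPool how many worker threads it still has available. The sketch below uses the real ThreadPool.GetAvailableThreads method. The class and method names are hypothetical, and the answer is only a snapshot, since other code in the process may claim pool threads a moment later.

```csharp
using System;
using System.Threading;

static class PoolLimitSketch
{
    // Returns true if, at this instant, the pool could run 'requestCount'
    // delegates without queuing any of them.
    public static bool FitsInPool(int requestCount)
    {
        int workerThreads, completionPortThreads;
        // Available threads = the pool's maximum minus those currently active.
        ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads);
        return requestCount <= workerThreads;
    }
}
```

A caller might use FitsInPool(fileNames.Length), where fileNames is a hypothetical array of downloads, to decide whether to launch everything at once or to throttle the requests itself.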
The queuing makes the total run time less predictable. If the requests are ordered just right, the total time may still be just the slowest time plus some overhead. Yet if there are enough requests and they are ordered in the wrong way, the total time may exceed that of the simple, synchronous scenario.
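The effect of ordering can be illustrated with a small back-of-the-envelope model. The sketch below is a deliberately simplified assumption, not a description of how the ThreadPool actually schedules work: it ignores all overhead, so it shows only the sensitivity to ordering, and it hands each queued request, in order, to whichever pooled thread becomes free first. The names QueueOrderSketch and TotalTime are hypothetical.

```csharp
using System;

static class QueueOrderSketch
{
    // Simulates running 'durations' (in arbitrary time units), in the given
    // order, on a pool of 'poolSize' threads, and returns the finish time
    // of the last request.
    public static int TotalTime(int[] durations, int poolSize)
    {
        int[] freeAt = new int[poolSize];  // time at which each thread is next idle
        int total = 0;
        foreach (int duration in durations)
        {
            // Pick the thread that becomes free earliest.
            int next = 0;
            for (int t = 1; t < poolSize; t++)
                if (freeAt[t] < freeAt[next]) next = t;

            freeAt[next] += duration;      // the request runs to completion there
            if (freeAt[next] > total) total = freeAt[next];
        }
        return total;
    }
}
```

With a pool of two threads, TotalTime(new int[] { 4, 1, 1, 1, 1 }, 2) returns 4, the slowest single request, because the four short requests fit alongside the long one. Reordered as TotalTime(new int[] { 1, 1, 1, 1, 4 }, 2), the same work takes 6, because the long request does not start until time 2. The synchronous total for either ordering is 8.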