Threading in C#

Joseph Albahari

 


Part 4
Advanced Topics

Non-Blocking Synchronization

Earlier, we said that the need for synchronization arises even in the simple case of assigning or incrementing a field. Although locking can always satisfy this need, a contended lock means that a thread must block, suffering the overhead and latency of being temporarily descheduled. The .NET framework's non-blocking synchronization constructs can perform simple operations without ever blocking, pausing, or waiting. These involve using instructions that are strictly atomic, and instructing the compiler to use "volatile" read and write semantics. At times these constructs can also be simpler to use than locks.

Atomicity and Interlocked

A statement is atomic if it executes as a single indivisible instruction. Strict atomicity precludes any possibility of preemption. In C#, a simple read or assignment on a field of 32 bits or less is atomic (assuming a 32-bit CPU). Operations on larger fields are non-atomic, as are statements that combine more than one read/write operation:

class Atomicity {
  static int x, y;
  static long z;
  
  static void Test() {
    long myLocal;
    x = 3;             // Atomic
    z = 3;             // Non-atomic (z is 64 bits)
    myLocal = z;       // Non-atomic (z is 64 bits)
    y += x;            // Non-atomic (read AND write operation)
    x++;               // Non-atomic (read AND write operation)
  }
}

Reading and writing 64-bit fields is non-atomic on 32-bit CPUs in the sense that two separate 32-bit memory locations are involved. If thread A reads a 64-bit value while thread B is updating it, thread A may end up with a bitwise combination of the old and new values.
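
To make the hazard concrete, here's a minimal sketch (the class name and values are ours, purely for illustration) that, when run on a 32-bit CPU, can eventually print a torn value:

using System;
using System.Threading;

class TornRead {
  static long balance;                  // shared 64-bit field

  static void Main() {
    new Thread (delegate() {            // writer: alternates two values
      while (true) { balance = 0; balance = long.MaxValue; }
    }).Start();

    while (true) {                      // reader, on the main thread
      long copy = balance;              // non-atomic read on 32-bit CPUs
      if (copy != 0 && copy != long.MaxValue)
        Console.WriteLine ("Torn read: 0x" + copy.ToString ("X"));
    }
  }
}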

Unary operators of the kind x++ require first reading a variable, then processing it, then writing it back. Consider the following class:

class ThreadUnsafe {
  static int x = 1000;
  static void Go () { for (int i = 0; i < 100; i++) x--; }
}

You might expect that if 10 threads concurrently ran Go, then x would end up 0. However this is not guaranteed, because it’s possible for one thread to preempt another in between retrieving x’s current value, decrementing it, and writing it back (resulting in an out-of-date value being written).

One way to solve these problems is to wrap the non-atomic operations in a lock statement. Locking, in fact, simulates atomicity. The Interlocked class, however, provides a simpler and faster solution for simple atomic operations:

class Program {
  static long sum;
 
  static void Main() {                                            // sum
 
    // Simple increment/decrement operations:
    Interlocked.Increment (ref sum);                              // 1
    Interlocked.Decrement (ref sum);                              // 0
 
    // Add/subtract a value:
    Interlocked.Add (ref sum, 3);                                 // 3
 
    // Read a 64-bit field:
    Console.WriteLine (Interlocked.Read (ref sum));               // 3
 
    // Write a 64-bit field while reading previous value:
    // (This prints "3" while updating sum to 10)
    Console.WriteLine (Interlocked.Exchange (ref sum, 10));       // 10
 
    // Update a field only if it matches a certain value (10):
    Interlocked.CompareExchange (ref sum, 123, 10);               // 123
  }
}

Using Interlocked is generally more efficient than obtaining a lock, because it can never block and suffer the overhead of its thread being temporarily descheduled.

Interlocked is also valid across multiple processes – in contrast to the lock statement, which is effective only across threads in the current process. An example of where this might be useful is in reading and writing into shared memory.
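
To illustrate, here's the earlier ThreadUnsafe example reworked with Interlocked (a sketch; the class name is ours). Ten threads concurrently running Go will now reliably leave x at zero:

using System.Threading;

class ThreadSafeCounter {
  static int x = 1000;

  // The decrement is now a single atomic operation - no lock required:
  static void Go() { for (int i = 0; i < 100; i++) Interlocked.Decrement (ref x); }
}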

Memory Barriers and Volatility

Consider this class:

class Unsafe {
  static bool endIsNigh, repented;
 
  static void Main() {
    new Thread (Wait).Start();        // Start up the spinning waiter
    Thread.Sleep (1000);              // Give it a second to warm up!
    repented = true;
    endIsNigh = true;
    Console.WriteLine ("Going...");
  }
  
  static void Wait() {
    while (!endIsNigh);               // Spin until endIsNigh
    Console.WriteLine ("Gone, " + repented);
  }
}

Here's a question: can a significant delay separate "Going..." from "Gone" – in other words, is it possible for the Wait method to continue spinning in its while loop after the endIsNigh flag has been set to true? Furthermore, is it possible for the Wait method to write "Gone, false"?

The answer to both questions is, theoretically, yes, on a multi-processor machine, if the thread scheduler assigns the two threads different CPUs. The repented and endIsNigh fields can be cached in CPU registers to improve performance, with a potential delay before their updated values are written back to memory. And when the CPU registers are written back to memory, it’s not necessarily in the order they were originally updated.

This caching can be circumvented by using the static methods Thread.VolatileRead and Thread.VolatileWrite to read and write to the fields. VolatileRead means “read the latest value”; VolatileWrite means “write immediately to memory”. The same functionality can be achieved more elegantly by declaring the field with the volatile modifier:

class ThreadSafe {
  // Always use volatile read/write semantics:
  volatile static bool endIsNigh, repented;
  ...
}
 

When the volatile keyword is used in preference to the VolatileRead and VolatileWrite methods, one can think of it in the simplest terms: "do not thread-cache this field!"

The same effect can be achieved by wrapping access to repented and endIsNigh in lock statements. This works because an (intended) side effect of locking is to create a memory barrier – a guarantee that the volatility of fields used within the lock statement will not extend outside the lock statement’s scope. In other words, the fields will be fresh on entering the lock (volatile read) and be written to memory before exiting the lock (volatile write).

Using a lock statement would in fact be necessary if we needed to access the fields repented and endIsNigh atomically, for instance, to run something like this:

lock (locker) { if (endIsNigh) repented = true; }

A lock may also be preferable where a field is used many times in a loop (assuming the lock is held for the duration of the loop). While a volatile read/write beats a lock in performance, it's unlikely that a thousand volatile read/write operations would beat one lock!
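
For instance (a sketch, with hypothetical locker, total and data fields):

// One volatile-read/volatile-write pair at the lock's boundaries covers
// the thousand ordinary reads and writes inside:
lock (locker) {
  for (int i = 0; i < 1000; i++)
    total += data [i];
}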

Volatility is relevant only to primitive integral (and unsafe pointer) types – other types are not cached in CPU registers and cannot be declared with the volatile keyword. Volatile read and write semantics are applied automatically when fields are accessed via the Interlocked class.

If one has a policy of always accessing fields shared by multiple threads within a lock statement, then volatile and Interlocked are unnecessary.

Wait and Pulse

Earlier we discussed Event Wait Handles – a simple signaling mechanism where a thread blocks until it receives notification from another.

A more powerful signaling construct is provided by the Monitor class, via two static methods – Wait and Pulse. The principle is that you write the signaling logic yourself using custom flags and fields (in conjunction with lock statements), then introduce Wait and Pulse commands to mitigate CPU spinning. The advantage of this low-level approach is that with just Wait, Pulse and the lock statement, you can achieve the functionality of AutoResetEvent, ManualResetEvent and Semaphore, as well as WaitHandle's static methods WaitAll and WaitAny. Furthermore, Wait and Pulse can be amenable in situations where none of the Wait Handles quite fits the job.

A problem with Wait and Pulse is their poor documentation – particularly with regard to their raison d'être. And to make matters worse, the Wait and Pulse methods have a peculiar aversion to dabblers: if you call on them without a full understanding, they will know – and will delight in seeking you out and tormenting you! Fortunately, there is a simple pattern one can follow that provides a fail-safe solution in every case.

Wait and Pulse Defined

The purpose of Wait and Pulse is to provide a simple signaling mechanism: Wait blocks until it receives notification from another thread; Pulse provides that notification.

Wait must execute before Pulse in order for the signal to work. If Pulse executes first, its pulse is lost, and the late waiter must wait for a fresh pulse, or remain forever blocked. This differs from the behavior of an AutoResetEvent, where its Set method has a "latching" effect and so is effective if called before WaitOne.

One must specify a synchronizing object when calling Wait or Pulse. If two threads use the same object, then they are able to signal each other. The synchronizing object must be locked prior to calling Wait or Pulse.

For example, if x has this declaration:

class Test {
  // Any reference-type object will work as a synchronizing object
  object x = new object();
}

then the following code blocks upon entering Monitor.Wait:

lock (x) Monitor.Wait (x);

The following code (if executed later on another thread) releases the blocked thread:

lock (x) Monitor.Pulse (x);

Lock toggling

To make this work, Monitor.Wait temporarily releases, or toggles, the underlying lock while waiting, so another thread (such as the one performing the Pulse) can obtain it. The Wait method can be thought of as expanding into the following pseudo-code:

Monitor.Exit (x);             // Release the lock
wait for a pulse on x
Monitor.Enter (x);            // Regain the lock

Hence a Wait can block twice: once in waiting for a pulse, and again in regaining the exclusive lock. This also means that Pulse by itself does not fully unblock a waiter: only when the pulsing thread exits its lock statement can the waiter actually proceed.

Wait's lock toggling is effective regardless of the lock nesting level. If Wait is called inside two nested lock statements:

lock (x)
  lock (x)
    Monitor.Wait (x);

then Wait logically expands into the following:

Monitor.Exit (x); Monitor.Exit (x);    // Exit twice to release the lock
wait for a pulse on x
Monitor.Enter (x); Monitor.Enter (x);  // Restore previous nesting level

Consistent with normal locking semantics, only the first call to Monitor.Enter affords a blocking opportunity.

Why the lock?

Why have Wait and Pulse been designed such that they will only work within a lock? The primary reason is so that Wait can be called conditionally – without compromising thread-safety. To take a simple example, suppose we want to Wait only if a boolean field called available is false. The following code is thread-safe:

lock (x) {
  if (!available) Monitor.Wait (x);
  available = false;
}

Several threads could run this concurrently, and none could preempt another in between checking the available field and calling Monitor.Wait. The two statements are effectively atomic. A corresponding notifier would be similarly thread-safe:

lock (x)
  if (!available) {
    available = true;
    Monitor.Pulse (x);
  }

Specifying a timeout

A timeout can be specified when calling Wait, either in milliseconds or as a TimeSpan. Wait then returns false if it gave up because of a timeout. The timeout applies only to the "waiting" phase (waiting for a pulse): a timed out Wait will still subsequently block in order to re-acquire the lock, no matter how long it takes. Here's an example:

lock (x) {
  if (!Monitor.Wait (x, TimeSpan.FromSeconds (10)))
    Console.WriteLine ("Couldn't wait!");
  Console.WriteLine ("But hey, I still have the lock on x!");
}

The rationale for this behavior is that in a well-designed Wait/Pulse application, the object on which one calls Wait and Pulse is locked just briefly. So re-acquiring the lock should be a near-instant operation.

Pulsing and acknowledgement

An important feature of Monitor.Pulse is that it executes asynchronously, meaning that it doesn't itself block or pause in any way. If another thread is waiting on the pulsed object, it's notified, otherwise the pulse has no effect and is silently ignored.

Pulse provides one-way communication: a pulsing thread signals a waiting thread. There is no intrinsic acknowledgment mechanism: Pulse does not return a value indicating whether or not its pulse was received. Furthermore, when a notifier pulses and releases its lock, there's no guarantee that an eligible waiter will kick into life immediately. There can be an arbitrary delay, at the discretion of the thread scheduler – during which time neither thread has the lock. This makes it difficult to know when a waiter has actually resumed, unless the waiter specifically acknowledges, for instance via a custom flag.

If reliable acknowledgement is required, it must be explicitly coded, usually via a flag in conjunction with another, reciprocal, Pulse and Wait.

Relying on timely action from a waiter with no custom acknowledgement mechanism counts as "messing" with Pulse and Wait. You'll lose!

Waiting queues and PulseAll

More than one thread can simultaneously Wait upon the same object – in which case a "waiting queue" forms behind the synchronizing object (this is distinct from the "ready queue" used for granting access to a lock). Each Pulse then releases a single thread at the head of the waiting-queue, so it can enter the ready-queue and re-acquire the lock. Think of it like an automatic car park: you queue first at the pay station to validate your ticket (the waiting queue); you queue again at the barrier gate to be let out (the ready queue).

[Wait and Pulse Diagram]

Figure 2: Waiting Queue vs. Ready Queue

The order inherent in the queue structure, however, is often unimportant in Wait/Pulse applications, and in these cases it can be easier to imagine a "pool" of waiting threads. Each pulse, then, releases one waiting thread from the pool.

Monitor also provides a PulseAll method that releases the entire queue, or pool, of waiting threads in one fell swoop. The pulsed threads won't all start executing exactly at the same time, however, but rather in an orderly sequence, as each of their Wait statements tries to re-acquire the same lock. In effect, PulseAll moves threads from the waiting-queue to the ready-queue, so they can resume in an orderly fashion.

How to use Pulse and Wait

Here's how we start. Imagine there are two rules:

- first, that the only synchronization construct available is the lock statement;
- second, that we can spin freely, without concern for wasting CPU time!

With those rules in mind, let's take a simple example: a worker thread that pauses until it receives notification from the main thread:

class SimpleWaitPulse {
  bool go;
  object locker = new object();
 
  void Work() {
    Console.Write ("Waiting... ");
    lock (locker) {                        // Let's spin!
      while (!go) {
        // Release the lock so other threads can change the go flag
        Monitor.Exit (locker); 
        // Regain the lock so we can re-test go in the while loop
        Monitor.Enter (locker);
      }
    }
    Console.WriteLine ("Notified!");
  }
 
  void Notify() {                        // called from another thread
    lock (locker) {
      Console.Write ("Notifying... ");
      go = true;
    }
  }
}

Here's a main method to set things in motion:

static void Main() {
  SimpleWaitPulse test = new SimpleWaitPulse();
 
  // Run the Work method on its own thread
  new Thread (test.Work).Start();            // "Waiting..."
 
  // Pause for a second, then notify the worker via our main thread:
  Thread.Sleep (1000);
  test.Notify();                 // "Notifying... Notified!"
}

The Work method is where we spin – extravagantly consuming CPU time by looping constantly until the go flag is true! In this loop we have to keep toggling the lock – releasing and re-acquiring it via Monitor's Exit and Enter methods – so that another thread running the Notify method can itself get the lock and modify the go flag. The shared go field must always be accessed from within a lock to avoid volatility issues (remember that all other synchronization constructs, such as the volatile keyword, are out of bounds in this stage of the design!)

The next step is to run this and test that it actually works. Here's the output from the test Main method:

Waiting... (pause) Notifying... Notified!

Now we can introduce Wait and Pulse. We do this by:

- replacing the lock toggling (the Monitor.Exit / Monitor.Enter pair) with Monitor.Wait;
- inserting a call to Monitor.Pulse whenever a blocking field (in this case, go) is changed.

Here's the updated class, with the Console statements omitted for brevity:

class SimpleWaitPulse {
  bool go;
  object locker = new object();
 
  void Work() {
    lock (locker)
      while (!go) Monitor.Wait (locker);
  }
 
  void Notify() {
    lock (locker) {
      go = true;
      Monitor.Pulse (locker);
    }
  }
}

The class behaves as it did before, but with the spinning eliminated. The Wait command implicitly performs the code we removed – Monitor.Exit followed by Monitor.Enter – but with one extra step in the middle: while the lock is released, it waits for another thread to call Pulse. The Notify method does just this, after setting the go flag true. The job is done.

Pulse and Wait Generalized

Let's now expand the pattern. In the previous example, our blocking condition involved just one boolean field – the go flag. We could, in another scenario, require an additional flag set by the waiting thread to signal that it's ready or complete. If we extrapolate by supposing there could be any number of fields involved in any number of blocking conditions, the program can be generalized into the following pseudo-code (in its spinning form):

class X {
  Blocking Fields:  one or more objects involved in blocking condition(s), eg
   bool go;   bool ready;   int semaphoreCount;   Queue <Task> consumerQ...
 
  object locker = new object();     // protects all the above fields!
 
  ... SomeMethod {
    ... whenever I want to BLOCK based on the blocking fields:
    lock (locker)
      while (! blocking fields to my liking ) {
        // Give other threads a chance to change blocking fields!
        Monitor.Exit (locker);
        Monitor.Enter (locker);
      }
 
    ... whenever I want to ALTER one or more of the blocking fields:
    lock (locker) { alter blocking field(s) }
  }
}

We then apply Pulse and Wait as we did before:

Here's the updated pseudo-code:

Wait/Pulse Boilerplate #1: Basic Wait/Pulse Usage

class X {
  < Blocking Fields ... >
  object locker = new object();

  ... SomeMethod {
    ...
    ... whenever I want to BLOCK based on the blocking fields:
    lock (locker)
      while (! blocking fields to my liking )
        Monitor.Wait (locker);

    ... whenever I want to ALTER one or more of the blocking fields:
    lock (locker) {
      alter blocking field(s)
      Monitor.Pulse (locker);
    }    
  }
}

This provides a robust pattern for using Wait and Pulse. Here are the key features of this pattern:

- the blocking condition is tested inside a while loop, within a lock;
- the blocking fields are read and written only within that same lock;
- Pulse is called whenever any blocking field is changed.

Most importantly, with this pattern, pulsing does not force a waiter to continue. Rather, it notifies a waiter that something has changed, advising it to re-check its blocking condition. It is the waiter, then, that determines whether it should proceed (via another iteration of its while loop) – not the pulser. The benefit of this approach is that it allows for sophisticated blocking conditions, without sophisticated synchronization logic.

Another benefit of this pattern is immunity to the effects of a missed pulse. A missed pulse happens when Pulse is called before Wait – perhaps due to a race between the notifier and waiter. But because in this pattern a pulse means "re-check your blocking condition" (and not "continue"), an early pulse can safely be ignored since the blocking condition is always checked before calling Wait, thanks to the while statement.

With this design, one can define multiple blocking fields, and have them partake in multiple blocking conditions, and yet still use a single synchronization object throughout (in our example, locker). This is usually better than having separate synchronization objects on which to lock, Pulse and Wait, in that one avoids the possibility of deadlock. Furthermore, with a single locking object, all blocking fields are read and written as a unit, avoiding subtle atomicity errors. It's a good idea, however, not to use the synchronization object for purposes outside the necessary scope (this can be assisted by declaring the synchronization object, as well as all blocking fields, private).

Producer/Consumer Queue

A simple Wait/Pulse application is a producer-consumer queue – the structure we wrote earlier using an AutoResetEvent. A producer enqueues tasks (typically on the main thread), while one or more consumers running on worker threads pick off and execute the tasks one by one.

In this example, we'll use a string to represent a task. Our task queue then looks like this:

Queue<string> taskQ = new Queue<string>();

Because the queue will be used on multiple threads, we must wrap all statements that read or write to the queue in a lock. Here's how we enqueue a task:

lock (locker) {
  taskQ.Enqueue ("my task");
  Monitor.PulseAll (locker);   // We're altering a blocking condition
}

Because we're modifying a potential blocking condition, we must pulse. We call PulseAll rather than Pulse because we're going to allow for multiple consumers. More than one thread may be waiting.

We want the workers to block while there's nothing to do, in other words, when there are no items on the queue. Hence our blocking condition is taskQ.Count==0. Here's a Wait statement that performs exactly this:

lock (locker)
  while (taskQ.Count == 0) Monitor.Wait (locker);

The next step is for the worker to dequeue the task and execute it:

lock (locker)
  while (taskQ.Count == 0) Monitor.Wait (locker);
 
string task;
lock (locker)
  task = taskQ.Dequeue();

This logic, however, is not thread-safe: we're basing a decision to dequeue upon stale information – obtained in a prior lock statement. Consider what would happen if we started two consumer threads concurrently, with a single item already on the queue. It's possible that neither thread would enter the while loop to block – both seeing a single item on the queue. They'd both then attempt to dequeue the same item, throwing an exception in the second instance! To fix this, we simply hold the lock a bit longer – until we've finished interacting with the queue:

string task;
lock (locker) {
  while (taskQ.Count == 0) Monitor.Wait (locker);
  task = taskQ.Dequeue();
}

(We don't need to call Pulse after dequeuing, as no consumer can ever unblock by there being fewer items on the queue).

Once the task is dequeued, there's no further requirement to keep the lock. Releasing it at this point allows the consumer to perform a possibly time-consuming task without unnecessarily blocking other threads.

Here's the complete program. As with the AutoResetEvent version, we enqueue a null task to signal a consumer to exit (after finishing any outstanding tasks). Because we're supporting multiple consumers, we must enqueue one null task per consumer to completely shut down the queue:

Wait/Pulse Boilerplate #2: Producer/Consumer Queue

using System;
using System.Threading;
using System.Collections.Generic;

public class TaskQueue : IDisposable {
  object locker = new object();
  Thread[] workers;
  Queue<string> taskQ = new Queue<string>();

  public TaskQueue (int workerCount) {
    workers = new Thread [workerCount];

    // Create and start a separate thread for each worker
    for (int i = 0; i < workerCount; i++)
      (workers [i] = new Thread (Consume)).Start();
  }

  public void Dispose() {
    // Enqueue one null task per worker to make each exit.
    foreach (Thread worker in workers) EnqueueTask (null);
    foreach (Thread worker in workers) worker.Join();
  }

  public void EnqueueTask (string task) {
    lock (locker) {
      taskQ.Enqueue (task);
      Monitor.PulseAll (locker);
    }
  }

  void Consume() {
    while (true) {
      string task;
      lock (locker) {
        while (taskQ.Count == 0) Monitor.Wait (locker);
        task = taskQ.Dequeue();
      }
      if (task == null) return;         // This signals our exit
      Console.Write (task);
      Thread.Sleep (1000);              // Simulate time-consuming task
    }
  }
}

Here's a Main method that starts a task queue, specifying two concurrent consumer threads, and then enqueues ten tasks to be shared amongst the two consumers:

  static void Main() {
    using (TaskQueue q = new TaskQueue (2)) {
      for (int i = 0; i < 10; i++)
        q.EnqueueTask (" Task" + i);
 
      Console.WriteLine ("Enqueued 10 tasks");
      Console.WriteLine ("Waiting for tasks to complete...");
    }
    // Exiting the using statement runs TaskQueue's Dispose method, which
    // shuts down the consumers, after all outstanding tasks are completed.
    Console.WriteLine ("\r\nAll tasks done!");
  }

Enqueued 10 tasks
Waiting for tasks to complete...
 Task1 Task0 (pause...) Task2 Task3 (pause...) Task4 Task5 (pause...)
 Task6 Task7 (pause...) Task8 Task9 (pause...)
All tasks done!

Consistent with our design pattern, if we remove PulseAll and replace Wait with lock toggling, we'll get the same output.

Pulse Economy

Let's revisit the producer enqueuing a task:

lock (locker) {
  taskQ.Enqueue (task);
  Monitor.PulseAll (locker);
}

Strictly speaking, we could economize by pulsing only when there's a possibility of freeing a blocked worker:

lock (locker) {
  taskQ.Enqueue (task);
  if (taskQ.Count <= workers.Length) Monitor.PulseAll (locker);
}

We'd be saving very little, though, since pulsing typically takes under a microsecond, and incurs no overhead on busy workers – since they ignore it anyway! It's a good policy with multi-threaded code to cull any unnecessary logic: an intermittent bug due to a silly mistake is a heavy price to pay for a one-microsecond saving! To demonstrate, this is all it would take to introduce an intermittent "stuck worker" bug that would most likely evade initial testing (spot the difference):

lock (locker) {
  taskQ.Enqueue (task);
  if (taskQ.Count < workers.Length) Monitor.PulseAll (locker);
}

Pulsing unconditionally protects us from this type of bug.

If in doubt, Pulse. Rarely can you go wrong by pulsing, within this design pattern.

Pulse or PulseAll?

This example comes with further pulse economy potential. After enqueuing a task, we could call Pulse instead of PulseAll and nothing would break.

Let's recap the difference: with Pulse, a maximum of one thread can awake (and re-check its while-loop blocking condition); with PulseAll, all waiting threads will awake (and re-check their blocking conditions). If we're enqueuing a single task, only one worker can handle it, so we need only wake up one worker with a single Pulse. It's rather like having a class of sleeping children – if there's just one ice-cream there's no point in waking them all to queue for it!

In our example we start only two consumer threads, so we would have little to gain. But if we started ten consumers, we might benefit slightly in choosing Pulse over PulseAll. It would mean, though, that if we enqueued multiple tasks, we would need to Pulse multiple times. This can be done within a single lock statement, as follows:

lock (locker) {
  taskQ.Enqueue ("task 1");
  taskQ.Enqueue ("task 2");
  Monitor.Pulse (locker);    // "Signal up to two 
  Monitor.Pulse (locker);    //  waiting threads."
}

The price of one Pulse too few is a stuck worker. This will usually manifest as an intermittent bug, because it will crop up only when a consumer is in a Waiting state. Hence one could extend the previous maxim "if in doubt, Pulse", to "if in doubt, PulseAll!"

A possible exception to the rule might arise if evaluating the blocking condition was unusually time-consuming.

Using Wait Timeouts

Sometimes it may be unreasonable or impossible to Pulse whenever an unblocking condition arises. An example might be if a blocking condition involves calling a method that derives information from periodically querying a database. If latency is not an issue, the solution is simple: one can specify a timeout when calling Wait, as follows:

lock (locker) {
  while ( blocking condition )
    Monitor.Wait (locker, timeout);
}
This forces the blocking condition to be re-checked, at a minimum, at a regular interval specified by the timeout, as well as immediately upon receiving a pulse. The simpler the blocking condition, the smaller the timeout can be without causing inefficiency.

The same system works equally well if the pulse is absent due to a bug in the program! It can be worth adding a timeout to all Wait commands in programs where synchronization is particularly complex – as an ultimate backup for obscure pulsing errors. It also provides a degree of bug-immunity if the program is modified later by someone not on the Pulse!

Races and Acknowledgement

Let's say we want to signal a worker five times in a row:

class Race {
  static object locker = new object();
  static bool go;
 
  static void Main() {
    new Thread (SaySomething).Start();
 
    for (int i = 0; i < 5; i++) {
      lock (locker) { go = true; Monitor.Pulse (locker); }
    }
  }
 
  static void SaySomething() {
    for (int i = 0; i < 5; i++) {
      lock (locker) {
        while (!go) Monitor.Wait (locker);
        go = false;
      }
      Console.WriteLine ("Wassup?");
    }
  }
}
 
Expected Output:

Wassup?
Wassup?
Wassup?
Wassup?
Wassup?

Actual Output:

Wassup?
 (hangs)

This program is flawed: the for loop in the main thread can free-wheel right through its five iterations any time the worker doesn't hold the lock. Possibly before the worker even starts! The Producer/Consumer example didn't suffer from this problem because if the main thread got ahead of the worker, each request would simply queue up. But in this case, we need the main thread to block at each iteration if the worker's still busy with a previous task.

A simple solution is for the main thread to wait after each cycle until the go flag is cleared by the worker. This, then, requires that the worker call Pulse after clearing the go flag:

class Acknowledged {
  static object locker = new object();
  static bool go;
 
  static void Main() {
    new Thread (SaySomething).Start();
 
    for (int i = 0; i < 5; i++) {
      lock (locker) { go = true; Monitor.Pulse (locker); }
      lock (locker) { while (go) Monitor.Wait (locker); }
    }
  }
 
  static void SaySomething() {
    for (int i = 0; i < 5; i++) {
      lock (locker) {
        while (!go) Monitor.Wait (locker);
        go = false; Monitor.Pulse (locker);   // Worker must Pulse
      }
      Console.WriteLine ("Wassup?");
    }
  }
}

Wassup? (repeated five times)

An important feature of such a program is that the worker releases its lock before performing its potentially time-consuming job (this would happen in place of where we're calling Console.WriteLine). This ensures the instigator is not unduly blocked while the worker performs the task for which it has been signaled (and is blocked only if the worker is busy with a previous task).

In this example, only one thread (the main thread) signals the worker to perform a task. If multiple threads were to signal the worker – using our Main method's logic – we would come unstuck. Two signaling threads could each execute the following line of code in sequence:

  lock (locker) { go = true; Monitor.Pulse (locker); }

resulting in the second signal being lost if the worker hadn't finished processing the first. We can make our design robust in this scenario by using a pair of flags – a "ready" flag as well as a "go" flag. The "ready" flag indicates that the worker is able to accept a fresh task; the "go" flag is an instruction to proceed, as before. This is analogous to a previous example that performed the same thing using two AutoResetEvents, except more extensible. Here's the pattern, refactored with instance fields:

Wait/Pulse Boilerplate #3: Two-way Signaling

public class Acknowledged {
  object locker = new object();
  bool ready;
  bool go; 

  public void NotifyWhenReady() {
    lock (locker) {
      // Wait if the worker's already busy with a previous job
      while (!ready) Monitor.Wait (locker);
      ready = false;
      go = true;
      Monitor.PulseAll (locker);
    }
  }

  public void AcknowledgedWait() {   
    // Indicate that we're ready to process a request
    lock (locker) { ready = true; Monitor.Pulse (locker); }
     
    lock (locker) {
      while (!go) Monitor.Wait (locker);      // Wait for a "go" signal
      go = false; Monitor.PulseAll (locker);  // Acknowledge signal
    }
     
    Console.WriteLine ("Wassup?");            // Perform task
  }
}

To demonstrate, we'll start two concurrent threads, each of which will notify the worker five times. Meanwhile, the main thread will wait for ten notifications:

public class Test {
  static Acknowledged a = new Acknowledged();
 
  static void Main() {
    new Thread (Notify5).Start();     // Run two concurrent
    new Thread (Notify5).Start();     // notifiers...
    Wait10();                         // ... and one waiter.
  }
 
  static void Notify5() {
    for (int i = 0; i < 5; i++)
      a.NotifyWhenReady();
  }
 
  static void Wait10() {
    for (int i = 0; i < 10; i++)
      a.AcknowledgedWait();
  }
}

Wassup?
Wassup?
Wassup?
 (repeated ten times)

In the NotifyWhenReady method, the ready flag is cleared before exiting the lock statement. This is vitally important: it prevents two notifiers signaling sequentially without re-checking the flag. For the sake of simplicity, we also set the go flag and call PulseAll in the same lock statement – although we could just as well put this pair of statements in a separate lock and nothing would break.

Simulating Wait Handles

You might have noticed a pattern in the previous example: both waiting loops have the following structure:

lock (locker) {
  while (!flag) Monitor.Wait (locker);
  flag = false;
 ...
}

where flag is set to true in another thread. This is, in effect, mimicking an AutoResetEvent. If we omitted flag=false, we'd then have a ManualResetEvent. Using an integer field, Pulse and Wait can also be used to mimic a Semaphore. In fact the only Wait Handle we can't mimic with Pulse and Wait is a Mutex, since this functionality is provided by the lock statement.
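
To illustrate, here's a sketch of an AutoResetEvent mimicked with Wait and Pulse, following Boilerplate #1 (the class name is ours):

using System.Threading;

class SimulatedAutoResetEvent {
  object locker = new object();
  bool flag;

  public void WaitOne() {
    lock (locker) {
      while (!flag) Monitor.Wait (locker);
      flag = false;                    // auto-reset: admit just one waiter
    }
  }

  public void Set() {
    lock (locker) { flag = true; Monitor.Pulse (locker); }
  }
}

Omitting the flag = false statement (and replacing Pulse with PulseAll) would give us the ManualResetEvent equivalent.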

Simulating the static methods that work across multiple Wait Handles is in most cases easy. The equivalent of calling WaitAll across multiple EventWaitHandles is nothing more than a blocking condition that incorporates all the flags used in place of the Wait Handles:

lock (locker) {
  while (!flag1 || !flag2 || !flag3...) Monitor.Wait (locker);
}

This can be particularly useful given that WaitAll is in most cases unusable due to COM legacy issues. Simulating WaitAny is simply a matter of replacing the || operators with && operators – so the waiter blocks only while no flag is set.

SignalAndWait is trickier. Recall that this method signals one handle while waiting on another in an atomic operation. We have a situation analogous to a distributed database transaction – we need a two-phase commit! Assuming we wanted to signal flagA while waiting on flagB, we'd have to divide each flag into two, resulting in code that might look something like this:

lock (locker) {
  flagAphase1 = true;
  Monitor.Pulse (locker);
  while (!flagBphase1) Monitor.Wait (locker);
 
  flagAphase2 = true;
  Monitor.Pulse (locker);
  while (!flagBphase2) Monitor.Wait (locker);
}

perhaps with additional "rollback" logic to retract flagAphase1 if the first Wait statement threw an exception as a result of being interrupted or aborted. This is one situation where Wait Handles are way easier! True atomic signal-and-waiting, however, is actually an unusual requirement.

Wait Rendezvous

Just as WaitHandle.SignalAndWait can be used to rendezvous a pair of threads, so can Wait and Pulse. In the following example, one could say we simulate two ManualResetEvents (in other words, we define two boolean flags!) and then perform reciprocal signal-and-waiting by setting one flag while waiting for the other. In this case we don't need true atomicity in signal-and-waiting, so we can avoid the need for a "two-phase commit". As long as we set our flag true and Wait in the same lock statement, the rendezvous will work:

class Rendezvous {
  static object locker = new object();
  static bool signal1, signal2;
 
  static void Main() {
    // Get each thread to sleep a random amount of time.
    Random r = new Random();
    new Thread (Mate).Start (r.Next (10000));
    Thread.Sleep (r.Next (10000));
 
    lock (locker) {
      signal1 = true;
      Monitor.Pulse (locker);
      while (!signal2) Monitor.Wait (locker);
    }
    Console.Write ("Mate! ");
  }
 
  // This is called via a ParameterizedThreadStart
  static void Mate (object delay) {
    Thread.Sleep ((int) delay);
    lock (locker) {
      signal2 = true;
      Monitor.Pulse (locker);
      while (!signal1) Monitor.Wait (locker);
    }
    Console.Write ("Mate! ");
  }
}

Mate! Mate! (almost in unison)

Wait and Pulse vs. Wait Handles

Because Wait and Pulse are the most flexible of the synchronization constructs, they can be used in almost any situation. Wait Handles, however, have two advantages:

- they have the capability of working across multiple processes;
- they are simpler to understand, and harder to break.

Additionally, Wait Handles are more interoperable in the sense that they can be passed around via method arguments. In thread pooling, this technique is usefully employed.

In terms of performance, Wait and Pulse have a slight edge, if one follows the suggested design pattern for waiting, that is:

lock (locker)
  while ( blocking condition ) Monitor.Wait (locker);

and the blocking condition happens to be false from the outset. The only overhead then incurred is that of taking out the lock (tens of nanoseconds) versus the few microseconds it would take to call WaitHandle.WaitOne. Of course, this assumes the lock is uncontended; even the briefest lock contention would be more than enough to even things out; frequent lock contention would make a Wait Handle faster!

Given the potential for variation through different CPUs, operating systems, CLR versions, and program logic; and that in any case a few microseconds is unlikely to be of any consequence before a Wait statement, performance may be a dubious reason to choose Wait and Pulse over Wait Handles, or vice versa.

A sensible guideline is to use a Wait Handle where a particular construct lends itself naturally to the job, otherwise use Wait and Pulse.

Suspend and Resume

A thread can be explicitly suspended and resumed via the methods Thread.Suspend and Thread.Resume. This mechanism is completely separate from that of blocking, discussed previously. Both systems are independent and operate in parallel.

A thread can suspend itself or another thread. Calling Suspend results in the thread briefly entering the SuspendRequested state, then upon reaching a point safe for garbage collection, it enters the Suspended state. From there, it can be resumed only via another thread that calls its Resume method. Resume will work only on a suspended thread, not a blocked thread.

From .NET 2.0, Suspend and Resume have been deprecated, their use discouraged because of the danger inherent in arbitrarily suspending another thread. If a thread holding a lock on a critical resource is suspended, the whole application (or computer) can deadlock. This is far more dangerous than calling Abort – which would result in any such locks being released – at least theoretically – by virtue of code in finally blocks.

It is, however, safe to call Suspend on the current thread – and in doing so one can implement a simple synchronization mechanism – with a worker thread in a loop – performing a task, calling Suspend on itself, then waiting to be resumed (“woken up”) by the main thread when another task is ready. The difficulty, though, is in testing whether or not the worker is suspended. Consider the following code:

worker.NextTask = "MowTheLawn";
if ((worker.ThreadState & ThreadState.Suspended) > 0)
  worker.Resume();
else
  // We cannot call Resume as the thread's already running.
  // Signal the worker with a flag instead:
  worker.AnotherTaskAwaits = true;

This is horribly thread-unsafe – the code could be preempted at any point in these five lines – during which the worker could march on in and change its state. While it can be worked around, the solution is more complex than the alternative – using a synchronization construct such as an AutoResetEvent or Monitor.Wait. This makes Suspend and Resume useless on all counts.
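
To illustrate the alternative, here's a sketch (the class and fields are hypothetical) in which the worker simply blocks on an AutoResetEvent until the main thread signals that another task awaits – no state testing required:

using System;
using System.Threading;

class TaskSignaler {
  EventWaitHandle taskReady = new AutoResetEvent (false);
  string nextTask;

  public void Work() {                    // runs on the worker thread
    while (true) {
      taskReady.WaitOne();                // sleep until signaled
      Console.WriteLine (nextTask);       // perform the task
    }
  }

  public void AssignTask (string task) {  // called from the main thread
    nextTask = task;
    taskReady.Set();                      // safe whatever the worker's state
  }
}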

The deprecated Suspend and Resume methods have two modes – dangerous and useless!

Aborting Threads

A thread can be ended forcibly via the Abort method:

class Abort {
  static void Main() {
    Thread t = new Thread (delegate() {while(true);});   // Spin forever
    t.Start();
    Thread.Sleep (1000);        // Let it run for a second...
    t.Abort();                  // then abort it.
  }
}

The thread, upon being aborted, immediately enters the AbortRequested state. If it then terminates as expected, it goes into the Stopped state. The caller can wait for this to happen by calling Join:

class Abort {
  static void Main() {
    Thread t = new Thread (delegate() { while (true); });
    Console.WriteLine (t.ThreadState);     // Unstarted
 
    t.Start();
    Thread.Sleep (1000);
    Console.WriteLine (t.ThreadState);     // Running
 
    t.Abort();
    Console.WriteLine (t.ThreadState);     // AbortRequested
 
    t.Join();
    Console.WriteLine (t.ThreadState);     // Stopped
  }
}

Abort causes a ThreadAbortException to be thrown on the target thread, in most cases right where the thread's executing at the time. The thread being aborted can choose to handle the exception, but the exception then gets automatically re-thrown at the end of the catch block (to help ensure the thread, indeed, ends as expected). It is, however, possible to prevent the automatic re-throw by calling Thread.ResetAbort within the catch block. The thread then re-enters the Running state (from which it can potentially be aborted again). In the following example, the worker thread comes back from the dead each time an Abort is attempted:

class Terminator {
  static void Main() {
    Thread t = new Thread (Work);
    t.Start();
    Thread.Sleep (1000); t.Abort();
    Thread.Sleep (1000); t.Abort();
    Thread.Sleep (1000); t.Abort();
  }
 
  static void Work() {
    while (true) {
      try { while (true); }
      catch (ThreadAbortException) { Thread.ResetAbort(); }
      Console.WriteLine ("I will not die!");
    }
  }
}

ThreadAbortException is treated specially by the runtime, in that it doesn't cause the whole application to terminate if unhandled, unlike all other types of exception.

Abort will work on a thread in almost any state – running, blocked, suspended, or stopped. However if a suspended thread is aborted, a ThreadStateException is thrown – this time on the calling thread – and the abortion doesn't kick off until the thread is subsequently resumed. Here's how to abort a suspended thread:

try { suspendedThread.Abort(); }
catch (ThreadStateException) { suspendedThread.Resume(); }
// Now the suspendedThread will abort.

Complications with Thread.Abort

Assuming an aborted thread doesn't call ResetAbort, one might expect it to terminate fairly quickly. But as it happens, with a good lawyer the thread may remain on death row for quite some time! Here are a few factors that may keep it lingering in the AbortRequested state:

- static constructors are never aborted partway through (so as not to poison the type for the remaining life of the application domain);
- code in catch or finally blocks is never aborted mid-stream;
- if the thread is executing unmanaged code when Abort is called, the ThreadAbortException is thrown only once execution returns to managed code.

The last factor can be particularly troublesome, in that the .NET framework itself often calls unmanaged code, sometimes remaining there for long periods of time. An example might be when using a networking or database class. If the network resource or database server dies or is slow to respond, it's possible that execution could remain entirely within unmanaged code, for perhaps minutes, depending on the implementation of the class. In these cases, one certainly wouldn't want to Join the aborted thread – at least not without a timeout!
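
A sketch of the cautious approach:

t.Abort();
if (!t.Join (TimeSpan.FromSeconds (5)))      // don't wait forever!
  Console.WriteLine ("Thread is probably stuck in unmanaged code");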

Aborting pure .NET code is less problematic, as long as try/finally blocks or using statements are incorporated to ensure proper cleanup takes place should a ThreadAbortException be thrown. However, even then, one can still be vulnerable to nasty surprises. For example, consider the following:

using (StreamWriter w = File.CreateText ("myfile.txt"))
  w.Write ("Abort-Safe?");

C#'s using statement is simply a syntactic shortcut, which in this case expands to the following:

StreamWriter w;
w = File.CreateText ("myfile.txt");
try     { w.Write ("Abort-Safe"); }
finally { w.Dispose();            }  

It's possible for an Abort to fire after the StreamWriter is created, but before the try block begins. In fact, by digging into the IL, one can see that it's also possible for it to fire in between the StreamWriter being created and assigned to w:

IL_0001:  ldstr      "myfile.txt"
IL_0006:  call       class [mscorlib]System.IO.StreamWriter
                     [mscorlib]System.IO.File::CreateText(string)
IL_000b:  stloc.0
.try
{
  ...

Either way, the Dispose method in the finally block is circumvented, resulting in an abandoned open file handle – preventing any subsequent attempts to create myfile.txt until the application domain ends.

In reality, the situation in this example is worse still, because an Abort would most likely take place within the implementation of File.CreateText. This is referred to as opaque code – code for which we don't have the source. Fortunately, .NET code is never truly opaque: we can again wheel in ILDASM, or better still, Lutz Roeder's Reflector – and looking into the framework's assemblies, see that it calls StreamWriter's constructor, which has the following logic:

public StreamWriter (string path, bool append, ...)
{
  ...
  ...
  Stream stream1 = StreamWriter.CreateFile (path, append);
  this.Init (stream1, ...);
}

Nowhere in this constructor is there a try/catch block, meaning that if the Abort fires anywhere within the (non-trivial) Init method, the newly created stream will be abandoned, with no way of closing the underlying file handle.

Because disassembling every required CLR call is obviously impractical, this raises the question of how one should go about writing an abort-friendly method. The most common workaround is not to abort another thread at all – but rather add a custom boolean field to the worker's class, signaling that it should abort. The worker checks the flag periodically, exiting gracefully if true. Ironically, the most graceful exit for the worker is by calling Abort on its own thread – although explicitly throwing an exception also works well. This ensures the thread's backed right out, while executing any catch/finally blocks – rather like calling Abort from another thread, except the exception is thrown only from designated places:

class ProLife {
  public static void Main() {
    RulyWorker w = new RulyWorker();
    Thread t = new Thread (w.Work);
    t.Start();
    Thread.Sleep (500);
    w.Abort();
  }
 
  public class RulyWorker {
    // The volatile keyword ensures abort is not cached by a thread
    volatile bool abort;   
 
    public void Abort() { abort = true; }
 
    public void Work() {
      while (true) {
        CheckAbort();
        // Do stuff...
        try      { OtherMethod(); }
        finally  { /* any required cleanup */ }
      }
    }
 
    void OtherMethod() {
      // Do stuff...
      CheckAbort();
    }
 
    void CheckAbort() { if (abort) Thread.CurrentThread.Abort(); }
  }
}
 

Calling Abort on one's own thread is one circumstance in which Abort is totally safe. Another is when you can be certain the thread you're aborting is in a particular section of code, usually by virtue of a synchronization mechanism such as a Wait Handle or Monitor.Wait. A third instance in which calling Abort is safe is when you subsequently tear down the thread's application domain or process.

Ending Application Domains

Another way to implement an abort-friendly worker is by having its thread run in its own application domain. After calling Abort, one simply tears down the application domain, thereby releasing any resources that were improperly disposed.

Strictly speaking, the first step – aborting the thread – is unnecessary, because when an application domain is unloaded, all threads executing code in that domain are automatically aborted. However, the disadvantage of relying on this behavior is that if the aborted threads don't exit in a timely fashion (perhaps due to code in finally blocks, or for other reasons discussed previously) the application domain will not unload, and a CannotUnloadAppDomainException will be thrown on the caller. For this reason, it's better to explicitly abort the worker thread, then call Join with some timeout (over which you have control) before unloading the application domain.

In the following example, the worker enters an infinite loop, creating and closing a file using the abort-unsafe File.CreateText method. The main thread then repeatedly starts and aborts workers. It usually fails within one or two iterations, with CreateText getting aborted part way through its internal implementation, leaving behind an abandoned open file handle:

using System;
using System.IO;
using System.Threading;
 
class Program {
  static void Main() {
    while (true) {
      Thread t = new Thread (Work);
      t.Start();
      Thread.Sleep (100);
      t.Abort();
      Console.WriteLine ("Aborted");
    }
  }
 
  static void Work() {
    while (true)
      using (StreamWriter w = File.CreateText ("myfile.txt")) { }
  }
}

Aborted
Aborted
IOException: The process cannot access the file 'myfile.txt' because it
is being used by another process.

Here's the same program modified so the worker thread runs in its own application domain, which is unloaded after the thread is aborted. It runs perpetually without error, because unloading the application domain releases the abandoned file handle:

class Program {
  static void Main (string [] args) {
    while (true) {
      AppDomain ad = AppDomain.CreateDomain ("worker");
      Thread t = new Thread (delegate() { ad.DoCallBack (Work); });
      t.Start();
      Thread.Sleep (100);
      t.Abort();
      if (!t.Join (2000)) {
        // Thread won't end - here's where we could take further action,
        // if, indeed, there was anything we could do. Fortunately in
        // this case, we can expect the thread *always* to end.
      }
      AppDomain.Unload (ad);            // Tear down the polluted domain!
      Console.WriteLine ("Aborted");
    }
  }
 
  static void Work() {
    while (true)
      using (StreamWriter w = File.CreateText ("myfile.txt")) { }
  }
}

Aborted
Aborted
Aborted
Aborted
...
...

Creating and destroying an application domain is relatively time-consuming in the world of threading activities (taking a few milliseconds), so it's something conducive to being done infrequently rather than in a loop! Also, the separation introduced by the application domain introduces another element that can be either of benefit or detriment, depending on what the multi-threaded program is setting out to achieve. In a unit-testing context, for instance, running threads on separate application domains can be of great benefit.

Ending Processes

Another way in which a thread can end is when the parent process terminates. One example of this is when a worker thread's IsBackground property is set to true, and the main thread finishes while the worker is still running. The background thread is unable to keep the application alive, and so the process terminates, taking the background thread with it.
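
For example (a sketch):

using System.Threading;

class Program {
  static void Main() {
    Thread worker = new Thread (delegate() { while (true); });
    worker.IsBackground = true;   // the worker cannot keep the process alive
    worker.Start();
  }                               // Main ends - the process exits, taking
}                                 // the background worker with it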

When a thread terminates because of its parent process, it stops dead, and no finally blocks are executed.

The same situation arises when a user terminates an unresponsive application via the Windows Task Manager, or a process is killed programmatically via Process.Kill.

 


© 2006, O'Reilly Media, Inc. All rights reserved

Last updated: 2006-10-16