Task Parallel Library for efficiency in C#

What is the Task Parallel Library (TPL)?

The Task Parallel Library is a set of APIs and types provided by Microsoft, in the System.Threading and System.Threading.Tasks namespaces, that simplifies writing multi-threaded and parallel code. The TPL APIs are provided at a high level, so that developers do not have to manage low-level details such as work partitioning, thread scheduling and cancellation support themselves.

What are the use cases for the Task Parallel Library (TPL)?

The Task Parallel Library has multiple use cases for dealing with task-based asynchronous programming, data parallelism and dataflows.
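As a minimal sketch of the data parallelism case, assuming the work is a simple loop over an array, `Parallel.For` runs the loop body for different indices on multiple threads. The input array and the squaring operation here are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical input: the numbers 1..1000.
int[] inputs = Enumerable.Range(1, 1000).ToArray();
long[] squares = new long[inputs.Length];

// Parallel.For partitions the index range across thread pool threads.
// Each iteration writes only to its own slot, so no locking is needed.
Parallel.For(0, inputs.Length, index =>
{
    squares[index] = (long)inputs[index] * inputs[index];
});

Console.WriteLine(squares[999]); // prints 1000000
```

The loop is safe to parallelize only because the iterations do not read or write each other's data.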

Task-Based Asynchronous Programming:

When you have CPU-bound work that can be broken into individual units that do not depend on each other, task-based parallelism can help reduce the time spent on processing. Microsoft's documentation includes multiple examples of task-based patterns and features.

    using System.Collections.Generic;
    using System.Threading.Tasks;

    var curriculumRecalculations = new List<Task<Curriculum>>();

    // given a set of calculations to run
    foreach (var predicate in calculations) {
        // each task does its own unit of work and returns its result
        // (Recalculate() is assumed to return a Curriculum)
        curriculumRecalculations.Add(Task.Run(() => predicate.Recalculate()));
    }

    // wait for all the tasks to finish
    Task.WaitAll(curriculumRecalculations.ToArray());

    // use the results
    foreach (var completedTask in curriculumRecalculations) {
        var result = completedTask.Result;
    }

Dataflows:

The TPL Dataflow library assists in reading and writing messages through an asynchronous data flow. This simplifies the creation of asynchronous Producer/Consumer patterns, in which a message producer forwards messages through a TPL message block for one or more consumers to consume. Using the Dataflow library, you can easily create pipeline or mesh style data flows.

    using System.Threading.Tasks.Dataflow;
    using System.Web;

    // Define a transformation to be applied to incoming data
    var removeHtmlEncoding = new TransformBlock<string, string>(data => {
            // HtmlDecode removes the encoding; HtmlEncode would add it
            return HttpUtility.HtmlDecode(data);
        }
    );

    // Another transformation to happen after the HTML encoding is removed
    var makeUppercase = new TransformBlock<string, string>(data => {
            return data.ToUpper();
        }
    );

    // Link the two dataflow blocks together so that they happen in sequence:
    // output of removeHtmlEncoding flows into makeUppercase
    var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
    removeHtmlEncoding.LinkTo(makeUppercase, linkOptions);

    // Post data into the head of the dataflow
    removeHtmlEncoding.Post(someHtmlEncodedString);

TPL dataflows can be modified on the fly, allowing addition or removal of dataflow blocks. Multiple types of Dataflow blocks are available to deal with transforming, buffering, batching and joining data.
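As a rough illustration of one of the other block types, here is a small sketch using `BatchBlock<T>` to group incoming items into fixed-size arrays. The batch size of 3 and the integer inputs are arbitrary choices for the example, and it assumes the System.Threading.Tasks.Dataflow library is available (on .NET Framework it is installed as a NuGet package).

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks.Dataflow;

// BatchBlock groups incoming items into arrays of a fixed size.
var batcher = new BatchBlock<int>(3);

for (int item = 1; item <= 7; item++)
{
    batcher.Post(item);
}

// Completing the block flushes the final, partially filled batch.
batcher.Complete();

var batchSizes = new List<int>();

// Drain the block until it has no more output to offer.
while (await batcher.OutputAvailableAsync())
{
    while (batcher.TryReceive(out int[] batch))
    {
        batchSizes.Add(batch.Length);
        Console.WriteLine(string.Join(",", batch));
    }
}
```

With seven items and a batch size of 3, the block emits two full batches followed by a final batch holding the single remaining item.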

When not to use Task Parallelism:

Although the Task Parallel Library makes parallelism easy to achieve in .NET, there are particular circumstances in which applying the TPL can lead to unintended consequences.

Short Task Time:

Task parallelism should be avoided when each unit of work is very short: allocating and scheduling the tasks adds overhead, and for short tasks that overhead can outweigh the gain and degrade performance.
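When the per-item work is tiny but there are many items, one common mitigation is to hand each thread a chunk of items rather than one item at a time. The sketch below assumes a simple array of doubles and uses `Partitioner.Create` to split the index range into chunks; the array size and the doubling operation are invented for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical data set: 100,000 values, each needing a trivial operation.
double[] values = new double[100_000];
for (int i = 0; i < values.Length; i++)
{
    values[i] = i;
}

// Partitioner.Create splits the index range into chunks, so the delegate
// is invoked once per chunk instead of once per element.
var ranges = Partitioner.Create(0, values.Length);

Parallel.ForEach(ranges, range =>
{
    // range.Item1 is inclusive, range.Item2 is exclusive.
    for (int i = range.Item1; i < range.Item2; i++)
    {
        values[i] = values[i] * 2;
    }
});

Console.WriteLine(values[values.Length - 1]); // prints 199998
```

Whether this beats a plain sequential loop depends on the workload, so it is worth measuring both before committing to either.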

Thread Affinity Environments:

Some application structures are not good candidates for the TPL because they place limitations on thread use - either restricting access to the UI to a particular thread (as in Windows Forms or WPF), or being designed to handle each request on a single thread (as is the case in ASP.NET).

Non Thread Safe Environments:

Interactions with non-thread-safe objects also prevent the TPL from being applied effectively. Parallelism is not recommended where it will cause concurrency issues, such as interacting with a datastore in a manner that is not thread-safe.
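To make the concurrency problem concrete, here is a small sketch in which parallel iterations update a shared total. It contrasts an unsynchronized update with `Interlocked.Add`; the counters and the range 1..10000 are illustrative only.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

long unsafeTotal = 0;
long safeTotal = 0;

Parallel.For(1, 10001, number =>
{
    // Not thread-safe: read-modify-write can interleave across threads,
    // so some additions may be lost.
    unsafeTotal += number;

    // Thread-safe: the addition is performed as one atomic operation.
    Interlocked.Add(ref safeTotal, number);
});

Console.WriteLine($"unsafe: {unsafeTotal}"); // may be less than 50005000
Console.WriteLine($"safe:   {safeTotal}");   // prints "safe:   50005000"
```

The unsynchronized total may happen to be correct on a given run, which is part of what makes this class of bug hard to catch.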

TPL Project Application:

A project I was working on required data about a user to be calculated - an intensive process that involved the creation and evaluation of expression trees to determine course eligibility. The application sourced messages from a message queue in Azure - each message held the ID of a user to be processed. Depending on the number of parameters for a user, and the number of potential courses the user could be eligible for, recalculation quickly became a time-intensive task, especially when run in a single thread. The application was published as a micro-service to an app cluster, so multiple separate workers existed to scale the application horizontally, yet as calculation of each message took up to a few seconds, the queue could hold several thousand messages at particularly high load times. Instead of scaling out further, the TPL afforded us a simple solution to make each node more efficient.

A few aspects of this particular case made it a great candidate for application of the Task Parallel Library:

  • The work is intensive, long running and CPU-bound.
  • Each task can be calculated separately and concurrently.
  • The legacy architecture of the worker did not utilize


Application of task-based asynchronous programming allowed the Service Fabric workers to deal more efficiently with users who had multiple clients or an expansive curriculum. Each curriculum item could be evaluated as a task - when all tasks had completed, the results were persisted to the database. This improved overall turnaround time for message queue processing, and reduced the need to horizontally scale the service as the business needs increased - saving the client hundreds of dollars a month.
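The shape of that approach can be sketched roughly as follows. This is not the project's actual code: `EvaluateEligibility`, `Persist`, the user ID and the course IDs are all hypothetical stand-ins, and the real evaluation and database write were considerably more involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the expensive expression tree evaluation.
bool EvaluateEligibility(int user, int course) => (user + course) % 2 == 0;

// Hypothetical stand-in for the database write.
var persisted = new List<(int CourseId, bool Eligible)>();
void Persist(IEnumerable<(int CourseId, bool Eligible)> rows) => persisted.AddRange(rows);

int userId = 42;                  // would come from the queue message
int[] courseIds = { 1, 2, 3, 4 }; // the user's candidate curriculum items

// Start one task per curriculum item.
var evaluations = courseIds
    .Select(courseId => Task.Run(() =>
        (CourseId: courseId, Eligible: EvaluateEligibility(userId, courseId))))
    .ToArray();

// Wait for every evaluation, then persist all results in a single step.
// Task.WhenAll returns the results in the same order as the input tasks.
var results = await Task.WhenAll(evaluations);
Persist(results);

Console.WriteLine(persisted.Count); // prints 4
```

Persisting once after all tasks complete keeps the database interaction on a single thread, which sidesteps the thread-safety concerns described earlier.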
