The Haunted House

Sunday, July 01, 2007

Pipeline Execution Pattern

[Figure: UML diagram]

This is the first version of a library for implementing a pipeline execution pattern. I originally wrote this library a couple of years ago and recently found it on one of my backup discs. So I dusted it off, tidied it up, and decided to release it as a code example.

This is only the first version and is by no means perfect; I have lots of ideas that can be added to this pattern to improve it.
The code is written in C# and the solution file is for Visual Studio 2005. If you want to run the unit tests you will need either NUnit or the ReSharper test runner.


A pipeline consists of a series of nodes that are registered with the system. Once registered, the nodes can be assembled in any order to form a pipeline. You then call execute() and the nodes are executed in order.

You might be wondering what the point of this is. A pipeline allows you to break a complex computational task down into smaller discrete steps. Before you run the pipeline you can give it some initial XML data, which is passed between the nodes as they execute. This essentially allows the nodes to talk to each other.

Pipeline Configuration

Via Code

A pipeline can be configured in two ways: through code, or via an XML config file. First we will discuss setting up a pipeline via code. To create a node you need to derive from the common Node base class.

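using System.Xml; // XmlDocument

internal class TestNode : Node
{
    public TestNode(Pipeline pipeline)
        : base(pipeline) { }

    // Validation step; returning false stops the pipeline.
    public override bool Verify(XmlDocument doc) { return true; }

    // The node's actual work goes here.
    public override bool Execute(XmlDocument doc) { return true; }
}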

When you derive from the base class you must override two virtual functions, Verify and Execute. When a pipeline executes, Verify is called first for each node. This gives you a chance to do validation and integrity checks, such as examining the data passed into the node. If Verify fails, you return false and the pipeline stops executing. If Verify returns true, the pipeline then calls Execute on the node. This is where you perform the actual node task.
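To make the Verify-then-Execute order concrete, here is a minimal sketch of how such a loop might look. SketchNode, RecordingNode and ExecutionSketch are hypothetical stand-ins written for this example, not the library's own classes, and stopping when Execute returns false is an assumption on my part.

```csharp
using System.Collections.Generic;
using System.Xml;

// Hypothetical stand-in for the library's Node base class.
public abstract class SketchNode
{
    public abstract bool Verify(XmlDocument doc);
    public abstract bool Execute(XmlDocument doc);
}

// Test double: its Verify result is fixed at construction, and Execute
// records the node's name so the running order can be inspected.
public class RecordingNode : SketchNode
{
    private readonly string name;
    private readonly bool valid;
    private readonly List<string> log;

    public RecordingNode(string name, bool valid, List<string> log)
    {
        this.name = name;
        this.valid = valid;
        this.log = log;
    }

    public override bool Verify(XmlDocument doc) { return valid; }

    public override bool Execute(XmlDocument doc)
    {
        log.Add(name);
        return true;
    }
}

public static class ExecutionSketch
{
    // For each node in order: call Verify, and only if it passes call Execute.
    // Returns false as soon as either call returns false (assumed behaviour
    // for Execute; the post only states it for Verify).
    public static bool Run(IList<SketchNode> nodes, XmlDocument doc)
    {
        foreach (SketchNode node in nodes)
        {
            if (!node.Verify(doc)) return false;
            if (!node.Execute(doc)) return false;
        }
        return true;
    }
}
```

With three nodes where the second fails verification, only the first node's Execute runs and Run returns false.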

To use your nodes with the pipeline you must first register the nodes. You do this as follows:

Pipeline pipeline = new Pipeline();
pipeline.RegisterNode("node1", new node1(pipeline));
pipeline.RegisterNode("node2", new node2(pipeline));
pipeline.RegisterNode("node3", new node3(pipeline));

You first construct an instance of the pipeline class, and then you create instances of your nodes. You must pass the pipeline instance into the node constructor. This gives the node access to the error reporting method of that pipeline instance.

Once a node has been constructed you must call RegisterNode on the instance of the pipeline. This makes that node available to the pipeline.

Now that you have registered your nodes, they are available to be added into a pipeline. This is done as follows.


You will notice that node1 has been added twice. This is perfectly fine. A node can only be registered once, but it can be added to the pipeline as many times as necessary.
To execute the pipeline you call:
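As a rough sketch of the whole register, add and execute flow, here is a self-contained miniature. MiniPipeline is hypothetical, AddNode is an assumed method name, nodes are reduced to simple Func<bool> delegates to keep it short, and throwing on a duplicate registration is my own choice; the library's real names and signatures may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical miniature of the register / add / execute flow.
public class MiniPipeline
{
    private readonly Dictionary<string, Func<bool>> registered =
        new Dictionary<string, Func<bool>>();
    private readonly List<string> order = new List<string>();

    // A name can be registered only once: Dictionary.Add throws
    // an ArgumentException on a duplicate key.
    public void RegisterNode(string name, Func<bool> node)
    {
        registered.Add(name, node);
    }

    // Assumed method name. Appends a registered node to the running order;
    // the same name may be added any number of times.
    public void AddNode(string name)
    {
        if (!registered.ContainsKey(name))
            throw new ArgumentException("Node is not registered: " + name);
        order.Add(name);
    }

    // Runs the nodes in the order they were added; stops at the first failure.
    public bool Execute()
    {
        foreach (string name in order)
        {
            if (!registered[name]()) return false;
        }
        return true;
    }
}

public static class AssemblySketch
{
    // Registers two nodes, adds node1 twice, and returns the execution trace.
    public static List<string> Run()
    {
        List<string> trace = new List<string>();
        MiniPipeline pipeline = new MiniPipeline();
        pipeline.RegisterNode("node1", () => { trace.Add("node1"); return true; });
        pipeline.RegisterNode("node2", () => { trace.Add("node2"); return true; });

        pipeline.AddNode("node1");
        pipeline.AddNode("node2");
        pipeline.AddNode("node1"); // same registered node, added a second time

        pipeline.Execute();
        return trace;
    }
}
```

AssemblySketch.Run() yields the trace node1, node2, node1: one registration, two appearances in the running order.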


Via XML Config file

You can also add nodes to a pipeline via a config file. At the moment you still need to register the nodes in code, but the config file will add the nodes to the pipeline. A future enhancement to this code will be for node registration via the config file. You load a config file as follows:


Look at the example.xml file included in the MSI package.

As you can see, inside the pipeline element we have a Nodes section. Within here individual Node tags specify the running order of the pipeline.
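For readers without example.xml to hand, here is a sketch of what reading such a file could involve. The layout in SampleConfig is a guess: the element and attribute names (Pipeline, Nodes, Node, Param, name, value) and the value 0.5 are assumptions for illustration and may not match the real file. Only the System.Xml calls are standard .NET.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class ConfigSketch
{
    // Hypothetical config layout; names and values are assumptions.
    public const string SampleConfig =
        "<Pipeline>" +
        "  <Nodes>" +
        "    <Node name=\"node1\"><Param name=\"tolerence\" value=\"0.5\" /></Node>" +
        "    <Node name=\"node2\" />" +
        "    <Node name=\"node1\" />" +
        "  </Nodes>" +
        "</Pipeline>";

    // Returns the node names in document order, which is the running order.
    public static List<string> ReadRunningOrder(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        List<string> order = new List<string>();
        foreach (XmlNode node in doc.SelectNodes("/Pipeline/Nodes/Node"))
        {
            order.Add(node.Attributes["name"].Value);
        }
        return order;
    }

    // Looks up a parameter on the first Node element; null if it is absent.
    public static string ReadFirstNodeParam(string xml, string paramName)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode param = doc.SelectSingleNode(
            "/Pipeline/Nodes/Node[1]/Param[@name='" + paramName + "']");
        return param == null ? null : param.Attributes["value"].Value;
    }
}
```

Because the running order is simply the order of the Node elements, reordering the pipeline means reordering lines in the file, with no recompile.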

You will also notice we have parameters set up in the first node. This allows you to configure parameters that are available to the node. These parameters are stored in the NodeParams part of a node. To extract the parameters inside a node you do the following:

String tolerence = Params.Find("tolerence");

Passing Data Through the Pipeline

When you create the pipeline you can add data that will get passed through the pipeline when it is executed. This data is available to each node to change as necessary. It is passed into both the Verify method and the Execute method as a standard .NET XmlDocument object. See the .NET documentation on how to extract and add data to an XmlDocument. Data is added to the pipeline as follows:
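The sketch below covers only the XmlDocument side of this, using standard System.Xml calls: building an initial document, then reading and extending it the way a node's Execute might. The element names (data, input, result) and the DataSketch helper are made up for illustration, and the library's own call for attaching the document to a pipeline is not shown.

```csharp
using System.Xml;

public static class DataSketch
{
    // Builds an initial document of the kind that could be given to a pipeline.
    public static XmlDocument CreateInitialData()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<data><input>21</input></data>");
        return doc;
    }

    // What a node's Execute might do: read a value from the shared document
    // and append a result element for later nodes to pick up.
    public static bool DoubleInput(XmlDocument doc)
    {
        XmlNode input = doc.SelectSingleNode("/data/input");
        if (input == null) return false; // nothing to work on

        int value = int.Parse(input.InnerText);
        XmlElement result = doc.CreateElement("result");
        result.InnerText = (value * 2).ToString();
        doc.DocumentElement.AppendChild(result);
        return true;
    }
}
```

After DoubleInput runs on the initial document, a later node can read the value 42 from /data/result.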


Object Cache

As well as passing an XML document to each node, a pipeline provides a central object cache where a node can store data to be retrieved by any node. This is useful if you want to store temporary data that should not be part of the XML document. It also helps performance, because repeatedly serializing temporary objects to XML can get slow. A node can cache data like this:

CacheObject("myobject", myObject);
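One simple way such a cache could work is a dictionary keyed by name, sketched below. ObjectCacheSketch is a hypothetical stand-in, and GetCachedObject is an assumed name for the retrieval side; only CacheObject appears in the text above.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the pipeline's central object cache.
public class ObjectCacheSketch
{
    private readonly Dictionary<string, object> cache =
        new Dictionary<string, object>();

    // Stores an object under a key, overwriting any previous entry.
    public void CacheObject(string key, object value)
    {
        cache[key] = value;
    }

    // Assumed retrieval method; returns null when the key is missing.
    public object GetCachedObject(string key)
    {
        object value;
        return cache.TryGetValue(key, out value) ? value : null;
    }
}
```

The cache hands back the same object reference that was stored, so nothing is serialized along the way, which is where the performance benefit comes from.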

Reporting Errors and Warnings

As each node executes, the pipeline builds up a list of error and warning messages. When the pipeline finishes executing, either successfully or unsuccessfully, your program can present the list of errors and warnings to the user. A node adds errors and warnings as follows:

AddWarning("Warning, something is not right.");
AddError("Error, something has gone wrong.");
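Here is a minimal sketch of the collecting side, to show how the messages might be gathered and then presented once the run is over. MessageLogSketch, HasErrors and Report are hypothetical names for this example; only AddWarning and AddError come from the library.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the pipeline's message store.
public class MessageLogSketch
{
    private readonly List<string> warnings = new List<string>();
    private readonly List<string> errors = new List<string>();

    public void AddWarning(string message) { warnings.Add(message); }
    public void AddError(string message) { errors.Add(message); }

    // True once at least one error has been recorded.
    public bool HasErrors { get { return errors.Count > 0; } }

    // Formats all messages for display, errors first, each in insertion order.
    public List<string> Report()
    {
        List<string> lines = new List<string>();
        foreach (string e in errors) lines.Add("ERROR: " + e);
        foreach (string w in warnings) lines.Add("WARNING: " + w);
        return lines;
    }
}
```

Keeping errors and warnings in separate lists lets the calling program decide whether a run with warnings only still counts as a success.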


I hope you find this pattern and code useful. As I mentioned earlier, I originally wrote this a few years ago and have only recently dug it out and updated it. I have more ideas I want to incorporate, such as:

- Node registration via config file.
- Nodes as separate plugin DLLs which are automatically loaded by the pipeline manager.
- Node branching.

I recommend you load up the code and dive in; it’s not that complicated to find your way around. In the solution there is a test project that contains a load of NUnit unit tests, so if you want to fiddle with the code you will easily be able to tell if you have broken anything. The unit tests also make good examples of how things work. Have fun.

Download MSI Installer for Source Code


