Hartmann pipeline

From Wikipedia, the free encyclopedia

Pipelines
Paradigm: Dataflow programming
Appeared in: 1986
Designed by: John Hartmann
Developer: IBM
Latest release: 1.1.12 / 2005-11-04
Influenced by: Pipeline (Unix), APL
Website: http://vm.marist.edu/~pipeline

A Hartmann pipeline is an extension of the Unix pipeline concept that provides more complex paths, multiple input and output streams, and other features. It is a form of pipeline programming.

A Hartmann pipeline is a non-procedural representation of the solution to a data processing problem as a dataflow. Because the dataflow itself is executed, the error-prone step of translating it into a traditional procedural programming language is eliminated. Hartmann pipelines can thus be considered an executable specification language.

The concept was developed by John Hartmann, a Danish engineer with IBM. It is available as the software product CMS Pipelines on a number of IBM platforms.

Overview

A pipeline consists of a collection of stages, joined together by stage separators and connectors. Stages can be written in a variety of languages, and are either filters that process data records or device drivers (sources and sinks) that read data into the pipeline or write data out of it. Unlike other implementations of pipeline programming, Hartmann's design allows multiple streams in and out of each stage and can interconnect them non-sequentially.

Compared with most programming languages, pipelines use very little notation: it is limited to stage separators (typically "|"), stream separators (typically ";" or "?"), and label separators (":"). Because they are used so often, the diskread stage is also known as < and the diskwrite stage as >; all stages, however, have names that are English words or make some sense in English.[1]

A simple example that reads a disk file, separates records containing the string "Hello" from those that do not, and writes both sets of records to different disk files can be written as:

< input.txt | A: locate /Hello/ | > found.txt ; A: | > notfound.txt

where the < stage reads the input disk file, the two > stages write the output disk files, and the locate stage separates the input stream into two output streams. locate's primary output is passed to the first > stage, and its secondary output is passed through the A: connector to the second > stage.
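The routing in this example can be mimicked in a few lines of Python. This is only an illustrative sketch, not CMS Pipelines code: the function name locate is borrowed from the stage, while the in-memory lists standing in for the disk files and the sample records are assumptions made for the illustration.

```python
def locate(records, needle):
    """Model of a two-output selection stage (hypothetical helper).

    Records containing `needle` go to the primary output,
    all other records go to the secondary output.
    """
    primary, secondary = [], []
    for record in records:
        if needle in record:
            primary.append(record)
        else:
            secondary.append(record)
    return primary, secondary


# Stand-in for the contents of input.txt (made-up sample data).
records = ["Hello world", "Goodbye", "Say Hello", "Nothing here"]

# found plays the role of found.txt, notfound the role of notfound.txt.
found, notfound = locate(records, "Hello")
print(found)
print(notfound)
```

Every input record ends up in exactly one of the two outputs, which is the property the label A: relies on when it connects the secondary output to the second > stage.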

Features

Some of the salient characteristics that distinguish Hartmann pipelines from ordinary Unix pipes are:

  • Filters may have multiple inputs and multiple outputs. For example, a selection filter can send the found records down one output pipe and the not found records down another.
  • A linear notation for representing pipeline networks.
  • An interface that allows REXX programs to act as stages.
  • A pacing strategy in the Pipeline supervisor that allows, for example, a stream to be split by a selection filter, the records on each output leg to be processed by other filters, and the legs then to be merged by a join filter, with the original record order preserved in the result stream.
  • As implied by the previous item, data streams are (generally) not simply buffered and passed along to the next filter. The filters operate in parallel with input and output records handled by the Pipeline supervisor.
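The effect of the pacing strategy on record order can be sketched in Python. This models only the observable result, not the supervisor's actual scheduling, which is not described here; the function names paced and buffered, the filters, and the sample records are all hypothetical.

```python
def paced(records, selector, on_match, on_miss):
    """Each record travels through its leg and reaches the merge
    before the next record is read, so input order is kept."""
    merged = []
    for record in records:
        leg = on_match if selector(record) else on_miss
        merged.append(leg(record))
    return merged


def buffered(records, selector, on_match, on_miss):
    """Contrast case: each leg is drained completely before the merge,
    so the two legs are no longer interleaved in input order."""
    matched = [on_match(r) for r in records if selector(r)]
    missed = [on_miss(r) for r in records if not selector(r)]
    return matched + missed


records = ["a1", "b2", "a3", "b4"]          # made-up sample records
starts_with_a = lambda r: r.startswith("a")  # hypothetical selection filter
shout = str.upper                            # filter on the "found" leg
mark = lambda r: r + "!"                     # filter on the "not found" leg

print(paced(records, starts_with_a, shout, mark))
print(buffered(records, starts_with_a, shout, mark))
```

The paced version yields the transformed records in their original sequence, while the buffered version groups them by leg, which is the behaviour the supervisor's pacing is meant to avoid.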

Similarity to APL

Programmers familiar with the APL programming language will see some similarities in Hartmann pipelines. The influence of APL is apparent: some of the filters have names and functions similar to specific APL primitive functions. Examples include the TAKE filter, which passes a specified number of records, and the DEAL filter, which spreads its input records out across its output streams, in imitation of the APL deal function.
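The two filters named above can be approximated in Python as follows. This is a hedged sketch rather than a description of the real stages: take is assumed to pass the first records of its input, and deal is assumed to distribute records round-robin, one per output stream in turn. Neither assumption is confirmed by the text, and the options of the actual filters are not modelled.

```python
from itertools import islice


def take(records, count):
    """Pass only the first `count` records (assumed behaviour)."""
    return list(islice(records, count))


def deal(records, stream_count):
    """Spread records across `stream_count` output streams.

    Round-robin order is an assumption made for this sketch.
    """
    streams = [[] for _ in range(stream_count)]
    for index, record in enumerate(records):
        streams[index % stream_count].append(record)
    return streams


records = ["r1", "r2", "r3", "r4", "r5"]  # made-up sample records
print(take(records, 2))
print(deal(records, 2))
```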

As with APL, becoming adept in the use of pipelines can fundamentally and permanently change a programmer's view of data processing problems and of how they may best be solved.

References

  1. ^ Melinda Varian (November 1995). "Plunging Into Pipes" (PDF). Retrieved on 2006-11-08.

External links