Data - Workflow (DataFlow|Pipeline) - ETL

1 - About

General-purpose workflow is complicated.

See also Code Design - (Pipeline|Workflow|Compose|Chain)

The complexity of workflow lies in specifying:

  • when user intervention is needed,
  • how to handle failures of an operation.

Transactions, retries, and timeouts are all features of a workflow system, along with parallel execution of operations, fault-tolerant execution, and other needed features.
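As an illustration of one of these features, here is a minimal sketch of retrying a failing operation in Python. The names `run_with_retries` and `OperationFailed` are hypothetical and do not come from any particular workflow system; timeouts, transactions, and parallel execution are deliberately left out.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class OperationFailed(Exception):
    """Hypothetical error raised once every attempt has failed."""


def run_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> T:
    """Call `operation` until it succeeds or `max_attempts` is reached."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as error:  # a real system would catch narrower errors
            last_error = error
            if attempt < max_attempts:
                time.sleep(delay_seconds)  # wait before the next attempt
    # All attempts failed: report it, keeping the last error as the cause.
    raise OperationFailed(f"gave up after {max_attempts} attempts") from last_error


# Usage: an operation that fails twice, then succeeds on the third call.
calls = {"count": 0}


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "done"


result = run_with_retries(flaky, max_attempts=5)
```

Raising a dedicated error after the last attempt is one way to mark the point where an operation can no longer recover on its own and user intervention may be needed.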


3 - Definitions

  • Actor - A person or program which performs some action.
  • Action - Something to be performed by an actor. Once an actor is notified that a given action is to be completed, they may perform it synchronously or asynchronously. It may take hours or days to complete the action.
  • Process Definition - A set of actions which need to be performed. The actions have a defined order in which they must be performed. Some actions may be performed concurrently with others.
  • Process - An instantiation of a process definition. Each process definition may have many processes running at once. A process definition can be compared to the on-disk image of a program, whereas a process is comparable to an executing program (possibly with multiple threads of execution). From an OO perspective, a process definition is analogous to a class definition and a process is like an instantiated object of that class.
  • Workflow Engine - A program, library, or API that can load process definitions and, from them, generate and execute processes.

Workflow is a label for systems which enable the building of process definitions and the execution of processes.
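The relationship between these terms can be sketched in Python. This is only an illustrative model under simplifying assumptions: the class names mirror the definitions above but are hypothetical, actions are plain functions run synchronously and strictly in list order, and actors, concurrency, and failure handling are omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# An action receives the state of the running process and may update it.
Action = Callable[[Dict[str, object]], None]


@dataclass
class ProcessDefinition:
    """A named, ordered set of actions (comparable to a class definition)."""
    name: str
    actions: List[Action]


@dataclass
class Process:
    """One running instance of a definition (comparable to an object)."""
    definition: ProcessDefinition
    state: Dict[str, object] = field(default_factory=dict)
    completed: int = 0  # number of actions already performed

    def run(self) -> None:
        # Perform the remaining actions in their defined order.
        for action in self.definition.actions[self.completed:]:
            action(self.state)
            self.completed += 1


class WorkflowEngine:
    """Loads process definitions and generates and executes processes."""

    def __init__(self) -> None:
        self._definitions: Dict[str, ProcessDefinition] = {}

    def load(self, definition: ProcessDefinition) -> None:
        self._definitions[definition.name] = definition

    def start(self, name: str) -> Process:
        process = Process(self._definitions[name])
        process.run()
        return process


# Usage: a three-step ETL definition, instantiated twice.
def extract(state: Dict[str, object]) -> None:
    state["rows"] = [1, 2, 3]


def transform(state: Dict[str, object]) -> None:
    state["rows"] = [row * 10 for row in state["rows"]]


def load_rows(state: Dict[str, object]) -> None:
    state["loaded"] = len(state["rows"])


engine = WorkflowEngine()
engine.load(ProcessDefinition("etl", [extract, transform, load_rows]))
first = engine.start("etl")
second = engine.start("etl")
```

Both processes come from the same definition, but each keeps its own state, just as two objects of one class do.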

4 - Library / Tool

5 - Visualization

6 - Documentation / Reference