Data Processing - (Pipeline | Compose | Chain)


About

A pipeline is a finite automaton.

A pipeline creates a composition relationship.
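
The composition relationship can be sketched in Python as follows. This is only an illustration: the helper name `pipeline` and the three step functions are hypothetical and do not come from any particular library.

```python
from functools import reduce
from typing import Any, Callable


def pipeline(*steps: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Compose steps left to right: the output of one step feeds the next."""
    def run(value: Any) -> Any:
        return reduce(lambda acc, step: step(acc), steps, value)
    return run


# Hypothetical steps, for illustration only.
def strip(text: str) -> str:
    return text.strip()


def lower(text: str) -> str:
    return text.lower()


def split_words(text: str) -> list:
    return text.split()


normalize = pipeline(strip, lower, split_words)
result = normalize("  Hello Pipeline World  ")
print(result)  # ['hello', 'pipeline', 'world']
```

The composed value `normalize` is itself a single function, so it can be used as a step in a larger pipeline.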

A pipeline is also known as:

  • Compose
  • Chain (for instance Chain of command) - Daisy Chain ;)

A dataflow (data workflow) and a pipeline are generally synonymous.
However, a pipeline tends to follow a compositional structure (a cascade of operations), whereas a dataflow forms a directed graph (which may contain loops, for instance).

Type

Imperative

The pipeline is executed step by step
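
A minimal sketch of the imperative style in Python, assuming a small hypothetical list of numbers as input. Each statement runs as soon as it is reached and stores its intermediate result before the next step starts.

```python
# Hypothetical input data, for illustration only.
numbers = [1, 2, 3, 4, 5, 6]

# Step 1 runs immediately and materializes its result.
evens = [n for n in numbers if n % 2 == 0]

# Step 2 runs immediately on the output of step 1.
squares = [n * n for n in evens]

# Step 3 runs immediately on the output of step 2.
total = sum(squares)

print(total)  # 56
```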

Declarative

The pipeline is executed only when the terminal operation is called.

Each step builds a composite type known as an algebraic data type.
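
The declarative style can be sketched as below. The `LazyPipeline` class, its `map`, `filter` and `collect` methods, and the `traced_square` helper are hypothetical names invented for this illustration. Each intermediate step only records what should happen and returns a value of the same type, and nothing is computed until the terminal operation `collect` is called.

```python
from typing import Any, Callable, Iterable, Iterator, List


class LazyPipeline:
    """Hypothetical lazy pipeline: steps are recorded, not run, until collect()."""

    def __init__(self, source: Iterable[Any], steps: tuple = ()):
        self._source = source
        self._steps = steps

    def map(self, fn: Callable[[Any], Any]) -> "LazyPipeline":
        # Intermediate operation: returns a new pipeline, runs nothing.
        return LazyPipeline(self._source, self._steps + (("map", fn),))

    def filter(self, predicate: Callable[[Any], bool]) -> "LazyPipeline":
        # Intermediate operation: returns a new pipeline, runs nothing.
        return LazyPipeline(self._source, self._steps + (("filter", predicate),))

    def collect(self) -> List[Any]:
        """Terminal operation: only here is the source actually traversed."""
        items: Iterator[Any] = iter(self._source)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)


calls = []


def traced_square(n: int) -> int:
    calls.append(n)  # record every value that is really processed
    return n * n


plan = LazyPipeline(range(1, 7)).filter(lambda n: n % 2 == 0).map(traced_square)
print(calls)           # [] because no step has run yet
print(plan.collect())  # [4, 16, 36]
print(calls)           # [2, 4, 6]
```

Because `map` and `filter` take a `LazyPipeline` and return a `LazyPipeline`, the steps can be composed freely before anything is executed.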

Example

Shell

In an OS shell (DOS, Bash), a series of commands connected by pipe operators forms a pipeline. See Shell Data Processing - Pipeline

Code

By returning the calling object from a function, you can compose (or chain) functions. See Design Pattern - (Object) Builder. When we compose (chain) an operation, the output of one operation becomes the input for the next operation, and operations are applied from left to right.
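
A minimal sketch of chaining by returning the calling object, written in Python. The `TextPipeline` class and its methods are hypothetical and exist only for this illustration.

```python
class TextPipeline:
    """Hypothetical chainable object: each method returns the calling object."""

    def __init__(self, text: str):
        self.text = text

    def strip(self) -> "TextPipeline":
        self.text = self.text.strip()
        return self  # returning self is what makes chaining possible

    def lower(self) -> "TextPipeline":
        self.text = self.text.lower()
        return self

    def replace(self, old: str, new: str) -> "TextPipeline":
        self.text = self.text.replace(old, new)
        return self


# Operations are applied from left to right: strip, then lower, then replace.
result = TextPipeline("  Hello World  ").strip().lower().replace(" ", "-").text
print(result)  # hello-world
```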

Message / Queue

See pipe stream

MapReduce

MapReduce - Pipeline
