(Data Processing|Data Integration)

1 - About

Data processing is the more general term: it covers any manipulation of data. Data integration is narrower: it is the integration of data between two systems.


3 - Management

3.1 - Model

Data integration has roughly two data processing models:

  • stream processing – Send a message to another process, to be handled asynchronously. Processing that executes continuously as long as data is being produced.
  • batch processing – Periodically crunch a large amount of accumulated data. Processing that is executed and runs to completion in a finite amount of time, releasing computing resources when finished.
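
As a minimal sketch of the difference between the two models, the example below computes a total both ways. The function names `batch_total` and `stream_totals` and the sample values are hypothetical, chosen only for illustration. The batch version needs the whole accumulated data set and finishes once; the stream version handles each record as it arrives and keeps emitting results for as long as data is produced.

```python
from typing import Iterable, Iterator


def batch_total(records: Iterable[int]) -> int:
    """Batch: process the accumulated data in one run, then finish."""
    return sum(records)


def stream_totals(records: Iterable[int]) -> Iterator[int]:
    """Stream: handle each record on arrival and emit an updated result."""
    total = 0
    for value in records:
        total += value
        yield total  # one result per incoming record


events = [3, 5, 2]  # hypothetical sample data
print(batch_total(events))          # 10
print(list(stream_totals(events)))  # [3, 8, 10]
```

In a real stream the input would be an unbounded source (a queue or a socket, for instance) instead of a finite list, so the generator would never run to completion on its own.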

3.2 - Function

Data processing may involve various processes, including:

  • Validation with data type conversion – Ensuring that supplied data is “clean, correct and useful”. See also: Data Quality
  • Sorting – “arranging items in some sequence and/or in different sets.”
  • Summarization – reducing detail data to its main points.
  • Aggregation – combining multiple pieces of data.
  • Analysis – the “collection, organization, analysis, interpretation and presentation of data”.
  • Reporting / visualization – listing detail or summary data or computed information.
  • Classification – separating data into various categories.

A woman using a keypunch to tabulate the United States Census, circa 1940:

[[https://commons.m.wikimedia.org/wiki/File:Card_puncher_-_NARA_-_513295.jpg| ]]
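
Several of the functions listed in section 3.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows, the field names `city` and `amount`, and the threshold of 100 used for classification are all hypothetical.

```python
from collections import defaultdict

# Hypothetical raw input: every value arrives as text.
raw_rows = [
    {"city": "Paris", "amount": "120.5"},
    {"city": "Lyon", "amount": "80"},
    {"city": "Paris", "amount": "n/a"},  # not a number, rejected below
    {"city": "Lyon", "amount": "20"},
]


def validate(rows):
    """Validation with data type conversion: keep rows whose amount is a number."""
    clean = []
    for row in rows:
        try:
            clean.append({"city": row["city"], "amount": float(row["amount"])})
        except ValueError:
            continue  # drop rows that fail the conversion
    return clean


def aggregate(rows):
    """Aggregation: combine the amounts of all rows that share a city."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["city"]] += row["amount"]
    return dict(totals)


def classify(row):
    """Classification: put each row in a category (threshold is an assumption)."""
    return "large" if row["amount"] >= 100 else "small"


clean_rows = validate(raw_rows)
sorted_rows = sorted(clean_rows, key=lambda r: r["amount"])  # Sorting
totals = aggregate(clean_rows)
categories = [classify(r) for r in sorted_rows]

print(totals)      # {'Paris': 120.5, 'Lyon': 100.0}
print(categories)  # ['small', 'small', 'large']
```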


4 - Others

4.1 - Goal

  • The “360 degree view of the enterprise” is a commonly discussed goal that, in practice, means data integration.

4.2 - Term

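  • ETL: Extraction, Transformation and Load software (the data is transformed before it is loaded into the target)
  • ELT: Extraction, Load and Transformation software (the data is loaded into the target first and transformed there)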
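
The two terms differ only in the order of the last two steps. The sketch below makes that order explicit, assuming plain Python lists stand in for the source and target systems; `extract`, `transform`, `etl` and `elt` are hypothetical helper names, not the API of any tool.

```python
def extract(source):
    """Read all rows from the source system (here: a plain list)."""
    return list(source)


def transform(rows):
    """Example transformation: trim whitespace and upper-case each row."""
    return [row.strip().upper() for row in rows]


def etl(source, target):
    """ETL: transform outside the target, then load only the transformed rows."""
    target.extend(transform(extract(source)))


def elt(source, target):
    """ELT: load the raw rows first, then transform them inside the target."""
    target.extend(extract(source))  # the raw data lands in the target
    target[:] = transform(target)   # the target does the transformation


source = [" a ", "b "]  # hypothetical raw rows
etl_target, elt_target = [], []
etl(source, etl_target)
elt(source, elt_target)
print(etl_target, elt_target)  # ['A', 'B'] ['A', 'B']
```

Both variants end with the same rows in the target. What changes is where the transformation runs and whether the raw data ever reaches the target.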

4.3 - Magic Quadrant

5 - Documentation / Reference

data/processing/processing.txt · Last modified: 2019/03/27 10:00 by gerardnico