Data Processing - (Batch|Bulk) Processing

1 - About

Batch processing (also called bulk or offline processing) means:

  • starting a process,
  • reading a lot of data in batches (in parallel if possible),
  • and terminating the process.
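
The three steps above can be sketched as a small Python job. This is only an illustration: the names `read_in_batches`, `process_batch` and `run_batch_job`, the batch size and the summing workload are all hypothetical and do not come from any particular batch framework.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def read_in_batches(records, batch_size):
    """Yield successive lists of at most batch_size records."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def process_batch(batch):
    """Hypothetical per-batch work: here, simply sum the values."""
    return sum(batch)


def run_batch_job(records, batch_size=1000, workers=4):
    """Start the job, process every batch with a worker pool, then end."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each batch is handed to a worker thread; results keep input order.
        partial_results = list(
            pool.map(process_batch, read_in_batches(records, batch_size))
        )
    # Leaving the "with" block shuts the pool down: the job has consumed
    # all of its input and terminates.
    return sum(partial_results)


if __name__ == "__main__":
    print(run_batch_job(range(10), batch_size=3, workers=2))
```

A thread pool is used here only to keep the sketch short. For CPU-bound work in CPython, a process pool or a distributed engine would usually be a better fit.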

2 - Article


3 - Implementation

Simple code generally iterates over one tuple at a time (for example, looping over the rows of a table). This kind of algorithm is harder to optimize and parallelize than a query written in a declarative, set-oriented language such as SQL.
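
To make the contrast concrete, here is a minimal sketch that computes the same per-region total in both styles, using Python's built-in `sqlite3` module. The `sales` table, its columns and the two function names are assumptions made up for this example.

```python
import sqlite3

# Hypothetical in-memory table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("south", 20), ("north", 5)],
)


def totals_tuple_at_a_time(conn):
    """Imperative style: the application loops over the rows one by one."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0) + amount
    return totals


def totals_set_oriented(conn):
    """Declarative style: the query states the result, the engine plans it."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    )
    return dict(rows)


if __name__ == "__main__":
    print(totals_tuple_at_a_time(conn))
    print(totals_set_oriented(conn))
```

Both functions return the same totals. In the first, the order of operations is fixed by the loop. In the second, the database engine is free to choose how to scan, group and (in engines that support it) parallelize the work.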