About

Aggregate functions return a single value computed from a set of values, that is, from values that are in an aggregation relationship.

These values are also known as summaries because they summarize (describe) a set of data.

List

Computed

Computed aggregates are generally calculated over additive numeric data, grouped by a class attribute.
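As a minimal sketch of a computed aggregate, the Java example below sums an additive measure per class attribute. The `Sale` record, its `category` and `amount` fields, and the sample rows are hypothetical names chosen for illustration; records assume Java 16 or later.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class GroupedSum {
    // Hypothetical row: a class attribute (category) and an additive measure (amount).
    record Sale(String category, int amount) {}

    // Groups the rows by category and sums the amounts of each group.
    // A TreeMap is used only to get a stable, sorted key order.
    static Map<String, Integer> sumByCategory(List<Sale> sales) {
        return sales.stream().collect(
            Collectors.groupingBy(Sale::category, TreeMap::new,
                Collectors.summingInt(Sale::amount)));
    }

    public static void main(String[] args) {
        List<Sale> sales = List.of(
            new Sale("books", 10),
            new Sale("games", 5),
            new Sale("books", 7));
        System.out.println(sumByCategory(sales)); // {books=17, games=5}
    }
}
```

Each group yields one summary value, which is the same shape of result a GROUP BY with SUM would give.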

Selection

Data Processing - Selection

  • FIRST() - Returns the first value
  • LAST() - Returns the last value
  • MAX() - Returns the largest value
  • MIN() - Returns the smallest value
  • Quantile - (Median|Middle) - Returns the median
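The selection functions listed above can be sketched in Java as follows. This is an illustration under assumptions, not a reference implementation: the input is a non-empty array, FIRST and LAST follow the input order, and for an even number of values the median is taken as the mean of the two middle values (other conventions exist). The class and method names are hypothetical.

```java
import java.util.Arrays;

final class Selection {
    // FIRST / LAST: position-based, they depend on the order of the input.
    static int first(int[] v) { return v[0]; }
    static int last(int[] v) { return v[v.length - 1]; }

    // MAX / MIN: value-based, independent of the input order.
    static int max(int[] v) { return Arrays.stream(v).max().getAsInt(); }
    static int min(int[] v) { return Arrays.stream(v).min().getAsInt(); }

    // Median: middle value of a sorted copy (assumed convention:
    // mean of the two middle values when the length is even).
    static double median(int[] v) {
        int[] s = v.clone();
        Arrays.sort(s);
        int mid = s.length / 2;
        return s.length % 2 == 1 ? s[mid] : (s[mid - 1] + s[mid]) / 2.0;
    }

    public static void main(String[] args) {
        int[] values = {7, 2, 9, 4, 5};
        System.out.println(first(values));  // 7
        System.out.println(last(values));   // 5
        System.out.println(max(values));    // 9
        System.out.println(min(values));    // 2
        System.out.println(median(values)); // 5.0
    }
}
```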

Implementation

Mutative operation

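A sum computed by mutative accumulation, where a running total is updated in a loop:

int[] numbers = {1, 2, 3, 4};  // example input
int sum = 0;                   // accumulator, mutated on every iteration
for (int x : numbers) {
    sum += x;
}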

Reduction operation

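Aggregate functions can also be implemented as a reduction operation: a combining function is applied repeatedly to fold all the elements into a single value.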
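As a small sketch, a sum can be written as a reduction with the Java Stream API: `reduce` takes an identity value (0) and a combining function (`Integer::sum`). The class and method names are hypothetical.

```java
import java.util.stream.IntStream;

final class ReductionSum {
    // Folds the elements into one value, starting from the identity 0.
    static int sum(int[] numbers) {
        return IntStream.of(numbers).reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3, 4})); // 10
    }
}
```

Because no variable is mutated by the caller, the runtime is free to split the work, provided the combining function is associative.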

Partition

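Some operations, like AVG, are not directly partitionable. The computation therefore cannot simply be run in parallel on each partition.

It is not valid to compute them on partitioned data and then apply the same function to the partial results, because the function is not both commutative and associative (averaging averages, for instance, is not associative).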
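To illustrate why AVG cannot be applied per partition and then again on the partial results, the sketch below compares the overall average with the average of the partition averages. The values and the split into two partitions are made up for the example.

```java
import java.util.Arrays;

final class AverageOfAverages {
    // Arithmetic mean of the given values (NaN for no values).
    static double avg(double... v) {
        return Arrays.stream(v).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Overall average of the whole data set {1, 2, 3, 10}.
        double overall = avg(1, 2, 3, 10);
        // Average of the averages of two partitions {1, 2, 3} and {10}.
        double ofPartitions = avg(avg(1, 2, 3), avg(10));
        System.out.println(overall);      // 4.0
        System.out.println(ofPartitions); // 6.0
    }
}
```

The two results differ because the partitions do not have the same size, so each partition average carries a different weight.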

You can still compute partial aggregates by replacing the function with functions that are commutative and associative, and then rolling the partial results up.

Example: to compute AVG(x) on partitioned data:

  • expand AVG(x) to SUM(x) / COUNT(x),
  • compute SUM(x) and COUNT(x) on each partition,
  • roll up the partial sums and the partial counts with SUM,
  • divide the total sum by the total count.
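The AVG example above can be sketched in Java as follows. The `Partial` record and the representation of partitions as a list of arrays are assumptions made for illustration (records assume Java 16 or later); a real engine would distribute the partitions differently.

```java
import java.util.List;

final class PartitionedAverage {
    // Partial aggregate of one partition: SUM(x) and COUNT(x).
    record Partial(long sum, long count) {
        static Partial of(int[] partition) {
            long s = 0;
            for (int x : partition) {
                s += x;
            }
            return new Partial(s, partition.length);
        }

        // Roll-up step: adding sums and counts is commutative and associative,
        // so partial results can be merged in any order and grouping.
        Partial merge(Partial other) {
            return new Partial(sum + other.sum, count + other.count);
        }

        // Final step: total sum divided by total count (NaN when count is 0).
        double average() {
            return (double) sum / count;
        }
    }

    static double average(List<int[]> partitions) {
        return partitions.parallelStream()
            .map(Partial::of)                           // per-partition SUM and COUNT
            .reduce(new Partial(0, 0), Partial::merge)  // roll-up
            .average();
    }

    public static void main(String[] args) {
        List<int[]> partitions = List.of(new int[] {1, 2, 3}, new int[] {10});
        System.out.println(average(partitions)); // 4.0
    }
}
```

The division happens only once, after the roll-up, which is what keeps the result equal to the average of the whole data set.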

See Parallel Programming - (Function|Operation)