Calcite - Query Cost

About

SQL Engine - (Query Plan) Cost in calcite.

Calcite applies a Cost based optimizer by default that is called the Volcano planner.

The cost is provided by the relational expression (relNode). See calculation

Cost is represented by RelOptCost
Cost typically includes
- rowcount
- IO
- CPU cost
Cost estimates are relative
Statistics are used to improve accuracy of cost estimations

Articles Related

Management

Print

The cost can be seen when you print the relational expression (ie explain) with all attributes.

Example from the getting started guide.

RelNode relNode = relRoot.project();
final RelWriter relWriter = new RelWriterImpl(new PrintWriter(System.out), SqlExplainLevel.ALL_ATTRIBUTES, false);
relNode.explain(relWriter);

LogicalSort(sort0=[$0], dir0=[ASC]): rowcount = 150.0, cumulative cost = {3668.75 rows, 11014.762352915506 cpu, 0.0 io}, id = 18
  LogicalProject(NAME=[$1], EXPR$1=[$2]): rowcount = 150.0, cumulative cost = {3518.75 rows, 5002.0 cpu, 0.0 io}, id = 17
    LogicalAggregate(group=[{0, 1}], EXPR$1=[COUNT()]): rowcount = 150.0, cumulative cost = {3368.75 rows, 4702.0 cpu, 0.0 io}, id = 15
      LogicalProject(deptno0=[$5], NAME=[$6], empid=[$0]): rowcount = 1500.0, cumulative cost = {3200.0 rows, 4702.0 cpu, 0.0 io}, id = 14
        LogicalJoin(condition=[=($1, $5)], joinType=[inner]): rowcount = 1500.0, cumulative cost = {1700.0 rows, 202.0 cpu, 0.0 io}, id = 12
          LogicalTableScan(table=[[HR, emps]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 10
          LogicalTableScan(table=[[HR, depts]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 11

Calculation

The cost is provided by the relational expression (relNode). Example with EnumerableHashJoin

@Override public RelOptCost computeSelfCost(RelOptPlanner planner,
      RelMetadataQuery mq) {
    double rowCount = mq.getRowCount(this);

    // Joins can be flipped, and for many algorithms, both versions are viable
    // and have the same cost. To make the results stable between versions of
    // the planner, make one of the versions slightly more expensive.
    switch (joinType) {
    case SEMI:
    case ANTI:
      // SEMI and ANTI join cannot be flipped
      break;
    case RIGHT:
      rowCount = RelMdUtil.addEpsilon(rowCount);
      break;
    default:
      if (RelNodes.COMPARATOR.compare(left, right) > 0) {
        rowCount = RelMdUtil.addEpsilon(rowCount);
      }
    }

    // Cheaper if the smaller number of rows is coming from the LHS.
    // Model this by adding L log L to the cost.
    final double rightRowCount = right.estimateRowCount(mq);
    final double leftRowCount = left.estimateRowCount(mq);
    if (Double.isInfinite(leftRowCount)) {
      rowCount = leftRowCount;
    } else {
      rowCount += Util.nLogN(leftRowCount);
    }
    if (Double.isInfinite(rightRowCount)) {
      rowCount = rightRowCount;
    } else {
      rowCount += rightRowCount;
    }
    if (isSemiJoin()) {
      return planner.getCostFactory().makeCost(rowCount, 0, 0).multiplyBy(.01d);
    } else {
      return planner.getCostFactory().makeCost(rowCount, 0, 0);
    }
  }

Documentation / Reference

introduction-to-apache-calcite - Page 95