TPC-DS - Query

About

A query is an ordered set of one or more valid SQL statements resulting from applying the required parameter substitutions to a given query template.

The order of the SQL statements is defined in the query template.
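As a minimal sketch of this idea, the following Python snippet applies substitution values to a toy template and returns the statements in template order. The bracketed placeholder style, the `render_query` helper, and the sample template are illustrative assumptions. In the benchmark itself, the executable query text is produced by the TPC-supplied tooling, not by hand-written code like this.

```python
import re

def render_query(template_statements, params):
    """Return the statements of a query, in template order, with placeholders filled in.

    Placeholders are written as [NAME] here; this syntax is an assumption of the sketch.
    """
    def substitute(statement):
        # Replace each [NAME] with its value; a missing parameter raises KeyError.
        return re.sub(r"\[([A-Z_]+)\]", lambda m: str(params[m.group(1)]), statement)

    # A list comprehension preserves the order defined by the template.
    return [substitute(s) for s in template_statements]

# Hypothetical two-statement template (not one of the TPC-supplied templates).
template = [
    "SELECT COUNT(*) FROM store_sales WHERE ss_quantity > [MIN_QTY]",
    "SELECT COUNT(*) FROM store_sales WHERE ss_quantity <= [MIN_QTY]",
]
query = render_query(template, {"MIN_QTY": 10})
```

Changing the value of `MIN_QTY` yields a different query from the same template, which is how query parameters can vary across executions.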

The mechanism used to submit queries to the SUT and to measure their execution time is called a driver.
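The driver's role can be illustrated with a small Python sketch that submits the statements of one query in order and measures the elapsed time. It uses an in-memory SQLite database as a stand-in for the SUT; the `run_query` helper and the toy `sales` table are assumptions of this sketch, and it does not implement the timing rules of the benchmark specification.

```python
import sqlite3
import time

def run_query(connection, statements):
    """Submit the statements of one query in order; return (rows, elapsed_seconds)."""
    start = time.perf_counter()
    rows = []
    for statement in statements:
        # fetchall() forces the full answer set to be retrieved before timing stops.
        rows.append(connection.execute(statement).fetchall())
    elapsed = time.perf_counter() - start
    return rows, elapsed

# Stand-in SUT: an in-memory SQLite database with a toy table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (quantity INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?)", [(5,), (15,), (25,)])

results, seconds = run_query(conn, ["SELECT COUNT(*) FROM sales WHERE quantity > 10"])
```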

The benchmark's queries model long-running and multi-part queries, where the DBA can assume that the data processing system is quiescent for queries during any particular period.

The users and queries modeled by the benchmark exhibit the following characteristics:

  • They address complex business problems
  • They use a variety of access patterns, query phrasings, operators, and answer set constraints
  • They employ query parameters that change across query executions

Property

Each query is described by the following components:

  • a) A business question, which illustrates the business context in which the query could be used. The business questions are listed in Appendix B.
  • b) The functional query definition, as specified in the TPC-supplied query template (see Clause 4.1.2 for a discussion of Functional Query Definitions)
  • c) The substitution parameters, which describe the substitution values needed to generate the executable query text
  • d) The answer set

Comment:

  • Some functional query definitions include a limit on the number of rows to be returned by the query. These limits are omitted from the business question.
  • In cases where the business question does not accurately describe the functional query definition, the latter will prevail.

Class

Query classes are divided according to the part of the schema they access.

  • The catalog sales channel is dedicated to the reporting part.
  • The store and web channels are dedicated to the ad-hoc part.

Queries accessing the ad-hoc part constitute the ad-hoc query set while the queries accessing the reporting part are considered the reporting queries.
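The rule above can be sketched in Python as a simple check on which channel tables a query mentions. The table sets below (the sales and returns fact tables of each channel) and the `classify_query` helper are assumptions of this sketch. The "mixed" and "unclassified" labels are also the sketch's own, since the text above does not say how such queries are treated.

```python
import re

# Assumed mapping of channel fact tables to the two parts of the schema.
REPORTING_TABLES = {"catalog_sales", "catalog_returns"}
AD_HOC_TABLES = {"store_sales", "store_returns", "web_sales", "web_returns"}

def classify_query(sql):
    """Classify a query by the channel tables named in its text."""
    # Split the SQL into lowercase identifier-like words and look for table names.
    words = set(re.findall(r"[a-z_]+", sql.lower()))
    touches_reporting = bool(words & REPORTING_TABLES)
    touches_ad_hoc = bool(words & AD_HOC_TABLES)
    if touches_reporting and touches_ad_hoc:
        return "mixed"  # label chosen for this sketch only
    if touches_reporting:
        return "reporting"
    if touches_ad_hoc:
        return "ad-hoc"
    return "unclassified"  # label chosen for this sketch only

label = classify_query("SELECT COUNT(*) FROM catalog_sales")
```

A word-level match like this is deliberately naive: it would also match a table name that appears in a comment or string literal, so a real classifier would need to parse the SQL.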

Notes:

  • The catalog sales channel was chosen as the reporting part because its data accounts for 40% of the entire data set.
  • The reporting and ad-hoc parts of the schema differ in the kinds of auxiliary data structures that can be created.

TPC-DS has defined four broad classes of queries that characterize most decision support queries:

  • Reporting queries
  • Ad hoc queries
  • Iterative OLAP queries
  • Data mining queries




