TPC-DS - dsqgen

> (Data|State) Management and Processing > (Data Type|Data Structure) > (Relation|Table) - Tabular data > Relation - Database Benchmark > TPC - DS

1 - About

dsqgen translates the query templates into fully functional SQL, which is known as executable query text (EQT).

dsqgen can create:

Advertising

3 - Dialect

Since some common SQL features do not have ANSI standard forms (e.g. “LIMIT” and “BEGIN/COMMIT”), the dsqgen utility must be told which “dialect” to use. The following “dialect templates” are supported:

  • db2.tpl,
  • netezza.tpl,
  • oracle.tpl,
  • sqlserver.tpl

4 - Syntax

The following information are mandatory:

  • a distribution file.
    • By default the tpcds.idx file (in the current working dir)
    • or the distribution option given at the command line
  • a dialect file.
    • By default, the ansi.tpl file (in query_templates_dir)
    • or the dialect option given at the command line
  • and either:
    • a TEMPLATE option for the generation of one SQL query
    • or INPUT for the generation of a SQL stream

Help on Windows:

dsqgen.exe /help
qgen2 Population Generator (Version 2.8.0)
Copyright Transaction Processing Performance Council (TPC) 2001 - 2018


USAGE: qgen2 [options]

Note: When defined in a parameter file (using -p), parmeters should
use the form below. Each option can also be set from the command
line, using a form of '/param [optional argument]'
Unique anchored substrings of options are also recognized, and
case is ignored, so '/sc' is equivalent to '/SCALE'

General Options
===============
FILE =  <s>              -- read parameters from file <s>
VERBOSE =  [Y|N]         -- enable verbose output
HELP =  [Y|N]            -- display this message
OUTPUT_DIR =  <s>        -- write query streams into directory <s>
QUIET =  [Y|N]           -- suppress all output (for scripting)
STREAMS =  <n>           -- generate <n> query streams/versions
INPUT =  <s>             -- read template names from <s>
SCALE =  <n>             -- assume a database of <n> GB
LOG =  <s>               -- write parameter log to <s>
QUALIFY =  [Y|N]         -- generate qualification queries in ascending order

Advanced Options
===============
DISTRIBUTIONS =  <s>     -- read distributions from file <s>
PATH_SEP =  <s>          -- use <s> to separate path elements
RNGSEED =  <n>           -- seed the RNG with <n>
RELEASE =  [Y|N]         -- display QGEN release info
TEMPLATE =  <s>          -- build queries from template <s> ONLY
COUNT =  <n>             -- generate <n> versions per stream (used with TEMPLATE)
DEBUG =  [Y|N]           -- minor debugging outptut
FILTER =  [Y|N]          -- write generated queries to stdout
DIALECT =  <s>           -- include query dialect defintions found in <s>.tpl
DIRECTORY =  <s>         -- look in <s> for templates
Advertising

5 - Example

5.1 - One Query

The following example generates a SQL file (named query_0.sql) from the query1 template for a 100GB database using Oracle syntax.

  • Linux
dsqgen `
  –template query1.tpl `
  –directory query_templates `
  –dialect oracle `
  –scale 100
  • Windows (With relative path)
dsqgen  ^
  /template "..\query_templates\query1.tpl" ^
  /scale 100 ^
  /directory "..\query_templates"

5.2 - Stream

6 - Documentation / Reference

  • See QGEN.doc in test directory