TPC-DS - Load Test


About

The Load Test is defined as all activity required to bring the System Under Test to the configuration that immediately precedes the beginning of the Performance Test.

The Load Test must not include the execution of any of the queries in the Power Test or Throughput Test or any similar query.

Data

Definition

[Figure: TPC-DS row count by scale factor (SF)]

Data Generation

Serial

  • on Linux/Unix:
dsdgen -scale 1 -dir /tmp
  • on Windows:
dsdgen /scale 1 /dir c:\tmp

Parallel

Since dsdgen generates 200-300GB/hour serially on a 2-3GHz x86 processor, it is useful to run multiple parallel streams when generating large amounts of data.
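To put the quoted rate in perspective, the following is a minimal sketch that estimates the serial generation time for a given scale factor. It assumes the scale factor is roughly the data volume in GB and simply divides by the 200-300 GB/hour range above; the function name estimate_hours and the SF variable are hypothetical and not part of the TPC-DS kit.

```shell
#!/bin/sh
# Rough serial generation time, in whole hours (rounded up).
# Assumption: scale factor ~= data volume in GB, rate in GB/hour.
estimate_hours() {
  sf=$1     # scale factor (GB)
  rate=$2   # assumed dsdgen throughput (GB/hour)
  echo $(( (sf + rate - 1) / rate ))
}

SF=${SF:-1000}
# Fast bound uses 300 GB/hour, slow bound uses 200 GB/hour.
echo "SF=${SF}: about $(estimate_hours "$SF" 300) to $(estimate_hours "$SF" 200) hours serially"
```

With the default SF of 1000, the estimate comes out at 4 to 5 hours on a single stream, which is why splitting the work across several children is worthwhile at larger scale factors.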

Example: generating 1 GB of data with 4 parallel streams

  • on Linux/Unix, where a trailing ampersand (&) runs the process in the background:
SCALE=1
TPCDS_DIR=/tmp/dsdgen/${SCALE}
mkdir -p ${TPCDS_DIR}
dsdgen -scale ${SCALE} -f -dir ${TPCDS_DIR} -parallel 4 -child 1 &
dsdgen -scale ${SCALE} -f -dir ${TPCDS_DIR} -parallel 4 -child 2 &
dsdgen -scale ${SCALE} -f -dir ${TPCDS_DIR} -parallel 4 -child 3 &
dsdgen -scale ${SCALE} -f -dir ${TPCDS_DIR} -parallel 4 -child 4 &
  • on Windows, with 5 processes (start /b launches each one without opening a new window):
start /b dsdgen /scale 1 /parallel 5 /child 1
start /b dsdgen /scale 1 /parallel 5 /child 2
start /b dsdgen /scale 1 /parallel 5 /child 3
start /b dsdgen /scale 1 /parallel 5 /child 4
start /b dsdgen /scale 1 /parallel 5 /child 5

See also tpcds_home\tests\gen_base_data.sh, reproduced below.

In this script, dbgen2 is dsdgen.

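#!/bin/sh
# $id:$
# $log:$
# Expects SCALE (scale factor) and DOP (degree of parallelism) in the environment.
cd temp_build
rm -rf /data/*.csv
# Start one generator child per parallel stream, each in the background.
child=1
while [ $child -le $DOP ]
do
  ./dbgen2 -f -dir /data -scale $SCALE -parallel $DOP -child $child > datagen_out.$child 2>&1 &
  child=`expr $child + 1`
done
# Block until every child has finished.
wait
# Generate the first refresh (update) data set.
# Note: the scale is read from the first argument ($1) here, not from $SCALE.
./dbgen2 -f -dir /data -scale $1 -update 1 > datagen_out.update 2>&1

# Compare the total line count with the expected count for this scale factor.
wc -l /data/*.csv |grep -i total > /tmp/results
diff -w ../linecount_${SCALE}.req /tmp/results > gen_base_data.out
# A non-empty diff means the counts differ. POSIX exit codes are 0-255, so use 1 rather than -1.
[ -s gen_base_data.out ] && exit 1
rm /tmp/results datagen_out.*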
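
The script above uses a bare wait, which discards the exit codes of the background children, so a crashed generator is only noticed later through the line-count check. As a hedged sketch (not part of the TPC-DS kit), the loop can record each child's process ID and wait on them one by one. The run_children function and the DSDGEN override variable are hypothetical names introduced here for illustration; the dsdgen flags are the ones used earlier on this page.

```shell
#!/bin/sh
# DSDGEN is a hypothetical override so the loop can be dry-run with a stub command.
DSDGEN=${DSDGEN:-./dsdgen}

# run_children SCALE DOP DIR
# Starts DOP generator children in the background and succeeds only if all of them do.
run_children() {
  scale=$1; dop=$2; dir=$3
  pids=""
  child=1
  while [ "$child" -le "$dop" ]; do
    "$DSDGEN" -scale "$scale" -f -dir "$dir" -parallel "$dop" -child "$child" &
    pids="$pids $!"            # remember the PID of this child
    child=$((child + 1))
  done
  failed=0
  for pid in $pids; do
    # wait PID returns the exit status of that specific child
    wait "$pid" || failed=$((failed + 1))
  done
  [ "$failed" -eq 0 ]
}
```

A call such as run_children 1 4 /tmp/dsdgen/1 would then return a non-zero status if any of the four streams failed, so the caller can stop before the load continues.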




