Data Warehousing - The Workload is always mixed

1 - Mixed workload Definition

Also known as:

  • active data warehousing,
  • operational data warehousing etc.

All of these terms describe the same idea, namely a diverse workload running concurrently on a data warehouse system:

  • whether it is continuous data loads running while end users are querying data,
  • or short-running, OLTP-like activities (both queries and trickle data loads) mixed in with more classic ad-hoc, data-intensive query patterns.

All of these are mixed workloads.

Any mixed workload environment will have statements running with varying degrees of parallelism. Together, all of them must fit within the resources available on the entire system.
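To make the resource constraint concrete, here is a minimal sketch of how concurrent statements could share a fixed pool of parallel workers. It is an illustration only, not the algorithm of any particular database: the `Statement` class, the `priority` scale, the `grant_parallelism` function and the pool size of 16 are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Statement:
    name: str
    requested_dop: int  # degree of parallelism the statement asks for
    priority: int       # hypothetical scale: higher number = more important


def grant_parallelism(statements, total_workers):
    """Grant each statement a degree of parallelism (DOP) so that the
    sum of all grants never exceeds total_workers.

    Statements are served in priority order. Each one gets what it asks
    for if that much is still free, otherwise whatever is left.
    A grant of 0 means the statement has to wait.
    """
    remaining = total_workers
    grants = {}
    for stmt in sorted(statements, key=lambda s: s.priority, reverse=True):
        granted = min(stmt.requested_dop, remaining)
        grants[stmt.name] = granted
        remaining -= granted
    return grants


# A small mixed workload: a batch load, an ad-hoc report and an OLTP-like lookup.
workload = [
    Statement("nightly_batch_load", requested_dop=16, priority=1),
    Statement("adhoc_report", requested_dop=8, priority=2),
    Statement("oltp_lookup", requested_dop=1, priority=3),
]

grants = grant_parallelism(workload, total_workers=16)
for name, dop in grants.items():
    print(f"{name}: granted DOP {dop}")
```

In this sketch the low-priority batch load is downgraded from its requested DOP because the higher-priority statements are served first. A real system may instead queue statements or apply other policies.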

3 - What is my workload?

To understand the workload for your given system you will need to gather information on the following main points:

  • Who is doing the work? - Which users are running workloads, which applications are running workloads?
  • What types of work are done on the system? - Are these workloads batch, ad-hoc or resource-intensive (and on which resources), and are they mixed or separated in some form?
  • When are certain types of work being done? - Are there different workloads at different times of the day, and are there different priorities during different time windows?
  • Where are performance problem areas? - Are there any specific issues in today's workload, why are these problems there, what is being done to fix these?
  • What are the priorities, and do they change during a time window? - Which of the various workloads are most important and when?
  • Are there priority conflicts? - And if so, who is going to make sure the right decisions are made on the real priorities?
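As a starting point for the who / what / when questions above, the following sketch summarizes a query log. The log layout (user, application, hour of day, workload type, elapsed seconds) and every record in it are made-up assumptions for illustration. In practice this information would come from whatever monitoring or audit data your system exposes.

```python
from collections import defaultdict

# Hypothetical query-log records:
# (user, application, hour_of_day, workload_type, elapsed_seconds)
query_log = [
    ("etl_user", "loader", 2, "batch", 1800.0),
    ("etl_user", "loader", 3, "batch", 2100.0),
    ("alice", "bi_tool", 9, "ad-hoc", 45.0),
    ("bob", "bi_tool", 9, "ad-hoc", 120.0),
    ("app_svc", "order_api", 9, "oltp-like", 0.2),
    ("app_svc", "order_api", 14, "oltp-like", 0.3),
]


def profile(log):
    """Summarize a query log along the who / what / when dimensions."""
    who = defaultdict(float)   # elapsed seconds per (user, application)
    what = defaultdict(float)  # elapsed seconds per workload type
    when = defaultdict(set)    # workload types seen per hour of day
    for user, app, hour, kind, elapsed in log:
        who[(user, app)] += elapsed
        what[kind] += elapsed
        when[hour].add(kind)
    return dict(who), dict(what), dict(when)


who, what, when = profile(query_log)

# Hours in which more than one workload type ran, i.e. where the workload is mixed.
mixed_hours = sorted(hour for hour, kinds in when.items() if len(kinds) > 1)

print("Elapsed seconds per user/application:", who)
print("Elapsed seconds per workload type:", what)
print("Hours with a mixed workload:", mixed_hours)
```

The priority questions cannot be answered from a log alone. They still need a decision from whoever owns the system.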

4 - Documentation / Reference

data/warehouse/workload.txt · Last modified: 2017/10/27 15:54 by gerardnico