Dimensional Data Modeling - Slowly Changing Dimensions (SCD)

Star Schema

About

Tracking changes in dimension is referred as slowly changing dimensions.

It contains data history

In the source system a lot of changes are daily made :

  • new customers are added,
  • addresses are modified,
  • new regional hierarchies are implemented,
  • or simply the product descriptions and packaging change.

These sorts of changes need to be reflected in the dimension tables and in several cases, the history of the changes also needs to be tracked.

By remembering history, we are then able to look at historical data and compare it to their current situation.

Techniques

Type 1 - Overwrite Original Value

A change does not require tracking

Type 2 - Add a new record

With Type II SCD, a new version of the dimension record is created, and the existing version is marked as history.

Each row does not correspond to a different instance of an entity but a different “state”, a “snapshot” of the instance at a point in time.

To accommodate this, extra metadata is required for the dimension table, including an effective date column and an expiration date column. These columns are used to differentiate a current version from a historical version as follows:

  • Effective date column stores the effective date of the version, also known as start date.
  • Expiration date column stores the expiration date of the version, also known as end date.
  • Expiration date value of the current version is always set to NULL or a default date value.

The user must identify the columns whose history will be tracked (by creating a new version) whenever their values are changed. These columns are known as trigger columns.

Scd Type2

Facts or type 2 dimensions may contain effective and expiration dates. It is common to leave them as dates/timestamps for the reason their ONLY use is to filter queries in a temporal context.

Type 3 - Add a new column to store the previous value

With Type III SCD, a current value field is created to keep the current value of dimension record apart from its previous value.

To accomplish this, two columns are created for each data field:

  • one storing the current value
  • and one storing the previous value, respectively.

Note

SCD2 permits a complete As-Was (past) and As-Is (current) rapport.

Tactical As-Is
Strategical As Was

Bitemporal data

The version of ISO (and ANSI) standard SQL (ISO 9075 SQL:2011) has a particular interest on the fact that this version of standard SQL includes support for:

  • System Versioned Tables (transaction time)
  • and Application Time Period Tables (valid time)

which together provide support for bitemporal data.

And these “temporal extensions” can be used to design an SCD type 2.

Documentation / Reference





Discover More
Consistent Hashing
Cryptography - Hash

A hash function is an encryption crypto algorithm that takes as data as input (possibly large and of variable-sized) and produces a short fixed-length integer value (generally printed as an hexadecimal...
Data System Architecture
Data - History (Versioning) - Historical Data

Data history (or versioning) is just the derivation of data state. It can be found in the following data structure: where a new version of data is created by committing changes. a serie of...
Etl Basic Functionnality
Data Integration - ETL evolves into Data Integration

Data integration suffers from an image problem. It has become synonymous with extract, transform and load. Likewise, ETL has been regarded as a data warehousing technology. Both of these viewpoints...
Datavault
Data Vault - Satellite

Satellite are tables in the data vault: that hold all the descriptive attribute and their history in a scd2 form A satellite can be created from all descriptive attributes of a table source such...
Data System Architecture
Data Warehouse

A data warehouse is a large central data repository of current, history and summarised data coming from operational and external sources used primarily for analysis. s is large historical databases for...
Data System Architecture
Data Warehouse - Persistent layer (Fundamental Layer | Operational Data Store - ODS)

A persistent layer is a data warehouse layer where data is persisted (ie never deleted). A persistent layer is also known as: a fundamental layer an operational data store (ODS) an enterprise...
Data System Architecture
Data Warehousing - 34 Kimball Subsytems

This page takes back the Kimball Datawarehouse 34 Subsystem as a table of content and links them to a page on this website....
Star Schema
Dimensional Data Modeling - Dimension (Perspective)

A dimension is a part of a dimensional schema and provide the basis for analyzing data through slicing and dicing A dimension is just a set of descriptif attribute that are generally grouped: in a...
Dimension Details Attributes
OWB - Dimension

A dimension in OWB is a metadata construction to be able to: automatically build a table with its hierarchy. manage the SCD behaviour When you have finished to define the hierarchy a bind (to a...
Card Puncher Data Processing
OWB - How to implement a type 2 slowly changing dimension with a hash function ?

Slowly changing dimension is the ability to track change on a record from a data set. How can we implement the type 2 to track the change and implement an history view. The type 2 : add a new record...



Share this page:
Follow us:
Task Runner