Dimensional Modeling - Surrogate Key (substitute primary key)

> (OLAP|Analytic) > Dimensional Modeling - Dimensional Schemas

1 - About

Main Entry: 1sur·ro·gate

Pronunciation: 's&r-&-“gAt, 's&-r&-

Function: transitive verb

Inflected Form(s): -gat·ed; -gat·ing

Etymology: Latin surrogatus, past participle of surrogare to choose in place of another, substitute, from sub- + rogare to ask – more at RIGHT Date: 1533 : to put in the place of another: a : to appoint as successor, deputy, or substitute for oneself b : SUBSTITUTE

It is a substitute primary key – for when you don't have a real one.

An ID column populated via a sequence is known as a surrogate key.

3 - DataWarehouse Use

The use of SKs in Datawarehouse environments is not only simplicity of star design or performance.

You often need to populate your dimensions with extra records not found in the source table, such as “UNKNOWN” or “NOT APPLIES” records. If you don't use SKs, What PK value would you use for this extra records?

How can you be sure that the values you choose won't collide with future records from the source table?

Also, taking your dimensional schemas away from possible source key changes is a good idea. Of course creating and maintaining SKs adds complexity and is extra work for ETL processes.