Dimensional Modeling - Measure / Fact
Table of Contents
1 - About
The term fact is used to represent a measure. In a dimensional model, you will find one measure per column in a fact table
The different types of measurement are categorized into four groups.
We cannot improve what we cannot measure, so measure we must. Theo Schlossnagle, Cironus, Time but Faster
2 - Articles Related
3 - A boolean is not a numerical fact but a discrete attribute
When I write a boolean, I talk about:
- A boolean
Really often the boolean are represented by the number 0/1 and are included in a fact table.
It's a bad design as it is not in essence a quantitative variable but a category of the representation of a status.
3.1 - Why ? because
3.1.1 - nothing is binary
You may meet them to represent a two states of a condition such as True/False. It implies the use of one threshold: above it's true and below it's false (of vice-versa).
And as you know, as the business changes this condition can also changes and you may get the need of a multiple status such as :
- Really True/Minder True/False
It's no more representative with 0 of 1 and you have to change the structure of your fact and to add them in a dimension.
3.1.2 - you need to filter on it
When you need to filter on something, it's also in essence an attribute.
In order to analyse only one group (the True or False Group), you have to pass the value to a filter. On a detail level of your fact table, it's not a problem.
But when you begin to aggregate it to gain in performance you will loose this information and are completely unable to for instance analyse the False or Bad rows.
4 - Calculated fact
A calculated fact should be store physically in the database because :
- The cost of a user incorrectly representing gross profit for instance overwhelms the minor incremental storage cost.
- Storing it also ensures that all users and reporting applications refer to profit consistently.
Likewise, some organization want to perform the calculation in the query tool. This works if all users access the data using a common tool (which is seldom the case in our experience).
5 - Conforming facts
Conforming facts means making agreements on common business metrics such as key performance indicators (KPIs) across separated databases so that these numbers can be compared mathematically for calculating differences and ratios.
6 - Additivity
Additivity is crucial because datawarehouse application almost never retrieve a single fact row. The most useful facts are numeric and additive, such as sales amount.
Kimball (1996) defines three levels of additivity.
6.1 - Fully Additive
A Measure fully additive can be aggregated across all dimensions.
6.2 - Semi-Additive
Semi-Additive facts can be added only among some of the dimension.
6.3 - Non-additive
Non Additive facts simply can't be added at all. In general, the measures are only valid for a combination of dimension level.
- and in general all calculations on one specific intersection of the dimension.
A trend for instance is only valid for one month. You can have a positive trend on one month but a negative one on one week.
7 - Null keys
referential integrity is violated if we put a null in a fact table declared as a foreign key to a dimension table. In addition to the referential integrity alarms, null keys are the source of great confusion to the users because they can't join on null keys.
Avoid null keys in the fact table.
A proper design includes a row in the corresponding dimension table to identify that the dimension is not applicable to the measurement.
8 - Aggregate - Level Based measure
You can have two types of measures :
- an aggregate measure.
The aggregate measure is the result of a summary on the fact table.
- a level-based measure
A level-based measure is a column whose values are always calculated to a specific level of aggregation. For example, quota, goals are only defined for a highest level than the finest grain in the fact table. They are always defined by country, by year, …
8.1 - Units of measure
Each measure must have its metrics. If it is a monetary measure, it may have a currency field and if it is a unit measure it may have a field to explain the kind of units used e.g. centimeters, litres, cubic metres etc.