Dimensional Data Modeling - Hierarchy

Star Schema

About

Logical Data Modeling - Hierarchy Relationship in dimensional data modeling

A hierarchy is a set of levels having many-to-one relationships between each other, and the set of levels collectively makes up a dimension in the form of a tree (A tree shows a hierarchical relationship)

Each of the elements of a dimension could be summarized using a hierarchy. The hierarchy is a series of parent-child relationships, typically where a parent member represents the consolidation of the members which are its children. Parent members can be further aggregated as the children of another parent.

Example :

  • in Product Dimension, Individual unit roll up to brands. Brands roll up to categories, and categories roll up to departments.
  • in the time dimension : Fiscal Hierarchy: Fiscal Year > Fiscal Quarter > Fiscal Month > Fiscal Week > Day
  • in the region dimension : cities > regions > countries > global regions

The analyst could so start at a highly summarized level, such as the total difference between the actual results and the budget, and drill down into the cube to discover which locations, products and periods had produced this difference.

The hierarchical structure of dimensions provides the basis for analyzing data through drill down and roll up along its different levels.

A hierarchy describes parent-child relationships among a set of levels. All hierarchies must be strict 1:n relationships. One record in a parent level corresponds to multiple records in a child level. But one record in a child level corresponds to only one parent record within a hierarchy.

Hierarchies represent a declarative way for representing a 1:n relationship inside one table that the query rewrite capabilities from a database can use, for instance in Oracle with a materialized view query rewrite using join back and roll up

Hierarchy provide a way of classifying business events stored in the fact table, thereby reducing complexity.

You can find different sort of hierarchies :

Multiple hierarchies

It's not uncommon to represent multiple hierarchies in only one dimension table. Ideally, the attribute names and values should be unique across the multiple hierarchies. In this case, the schema is denormalized

Example :

  • in the time dimension : Fiscal Hierarchy: Fiscal Year > Fiscal Quarter > Fiscal Month > Fiscal Week > Day
  • in the time dimension : Calendar Hierarchy: Calendar Year > Calendar Quarter > Calendar Month > Day

Many-to-one relationships

A many-to-one relationship is where one entity (typically a column or set of columns) contains values that refer to another entity (a column or set of columns) that has unique values.

In relational databases, these many-to-one relationships are often enforced by foreign key/primary key relationships, and the relationships typically are between fact and dimension tables and between levels in a hierarchy.

This relationship is often used to describe classifications or groupings.

For example, in a geography schema having tables Region, State, and City, there are many states that are in a given region, but no states are in two regions. Similarly for cities, a city is in only one state (cities that have the same name but are in more than one state must be handled slightly differently). The key point is that each city exists in exactly one state, but a state may have many cities, hence the term “many-to-one.”

The different elements, or levels, of a hierarchy must have many-to-one relationships between children and parent levels, regardless of whether the hierarchy is physically represented in a star or snowflake schema; that is, the data must abide by these relationships. The clean data required to enforce the many-to-one relationships is an important characteristic of a dimensional schema.

The many-to-one relationships that define the hierarchy become levels in a dimension

Hierarchy type

Ragged / Skip-level

A ragged dimension is a particular type of unbalanced dimension where there are missing or duplicated levels in the hierarchy.

A hierarchy where some members do not have a value for a particular ancestor level.

For example, in the United States, the city of Washington in the District of Columbia does not belong to a state. The expectation is that users can still navigate from the country level (United States) to Washington and below without the need for a state.

In ragged hierarchies, the parent member of at least one member of a dimension is not at the level immediately above the member. Like unbalanced hierarchies, the branches of the hierarchies can descend to different levels.

In a ragged hierarchy, null values can appear in any of the level columns. Null column values between member names are skipped, so a parent can have a child member multiple levels below the parent level. The children of the USA shown in the example below are CA and Washington DC. The levels provide a meaningful context to its members in the ragged hierarchies. Thus, while Washington DC is a child of USA, it is included at the City level with San Francisco and Los Angeles.

Country State City
USA CA San Francisco
USA CA Los Angeles
USA NULL Washington DC
Vatican City NULL Vatican City

Unbalanced

An unbalanced dimension does not have the same number of levels in every branch.

A hierarchy where the leaves do not have the same depth. For example, an organization may choose to have data for the current month at the day level, data for the previous month's level, and data for the previous five years at the quarter level.

Recursive

When a dimension has a recursive hierarchy, you do not need to create any levels. Instead, you need to specify any required member information.

Hierarchy as a way to organize complexity

Hierarchy is one of the most common ways of organizing complexity for the purposes of human understanding, and is one of the basic principles of systems theory (Flood and Carson, 1993; Klir, 1985). Herbert Simon, the dual Nobel Prize winner, observed the use of hierarchy to organize complexity in organizational systems, social systems, biological systems, physical systems, and symbolic systems (Simon, 1996). Based on this, he proposed hierarchical organization as a general architecture for complexity. Hierarchical structures act as a complexity management mechanism by reducing the number of items one has to deal with at each level of the hierarchy.

For example, instead of having to deal with a million customers, one can group them into a hundred market sectors and then into 10 market segments. Hierarchical structures are a familiar and natural way of organizing information, and are all around us in everyday life—for example, books are organized hierarchically, Web sites are organized hierarchically, the organizations we work in are organized hierarchically—even this article is organized hierarchically. (but not the website)

1)





Discover More
Data System Architecture
(Tree|Nested Set|Hierarchy) Data Structure

A tree is a node that may have children. Tree's are inherently recursive by definition as each child of a node is a Tree itself, with or without children nodes. A tree is a special case of a graph structure...
Data System Architecture
Cube - Olap Database

An OLAP database is an olap application that provide a multidimensional view of the data and are designed to answer analytical questions such as “Why?” and “How?” An OLAP database will contains...
DOM - Hierarchy

The nodes in the node tree have a hierarchical relationship to each other. An element A is said to be preceded or followed by a second element B if A and B have the same parent node and there are...
Card Puncher Data Processing
Data Type - Hierarchy

Hierarchy relationship in data type. The hierarchy of type defines how the types are implicitly converted. Implicit conversion is allowed for types from child to an ancestor. Example of Hierarchy...
Datavault
Data Vault - Link

A Link table in the data vault methodology is: transactions, associations, hierarchies, and re-definition of business terms. Every table which has more than one foreign key is a candidate...
Utah Teapot
Data Visualization - Ring Charts

Can be used to display hierarchical trees.
Data System Architecture
Data Warehousing - 34 Kimball Subsytems

This page takes back the Kimball Datawarehouse 34 Subsystem as a table of content and links them to a page on this website....
Star Schema
Dimensional Data Modeling - Dimension (Perspective)

A dimension is a part of a dimensional schema and provide the basis for analyzing data through slicing and dicing A dimension is just a set of descriptif attribute that are generally grouped: in a...
Star Schema
Dimensional Data Modeling - Dimensional Schemas

This section is dimensional data modeling, That is the building of a cube (hypercube) A dimensional schema is a schema based on dimension. The dimensional schema can modeled: in a relational database...
Star Schema
Dimensional Data Modeling - Level

A level in a dimensional hierarchy is a set of attributes that has: an mandatory identifier attribute (known also as the business identifier) and descriptive attributes (optional) A level is...



Share this page:
Follow us:
Task Runner