Online Analytical Processing (Olap)

Data System Architecture

About

Online Analytical Processing (OLAP) refers to:

  • a class of application
  • an approach

to quickly answer multi-dimensional analytical queries.

The term OLAP was created as a slight modification of the traditional database term OLTP (Online Transaction Processing).

In the core of any OLAP system is a concept of an cube (also called a multidimensional cube or a hypercube). It consists of numeric facts called measures which are categorized by dimensions.

Aggregations

It has been claimed that for complex queries OLAP cubes can produce an answer in around 0.1% of the time for the same query on OLTP relational data. The most important mechanism in OLAP which allows it to achieve such performance is the use of aggregations (aggregate table). Aggregations are built from the fact table by changing the granularity on specific dimensions and aggregating up data along these dimensions. The number of possible aggregations is determined by every possible combination of dimension granularities.

Because usually there are many aggregations that can be calculated, often only a predetermined number are fully calculated; the remainder are solved on demand. The problem of deciding which aggregations (views) to calculate is known as the view selection problem. View selection can be constrained by the total size of the selected set of aggregations, the time to update them from changes in the base data, or both. The objective of view selection is typically to minimize the average time to answer OLAP queries, although some studies also minimize the update time. View selection is Algorithm - NP Complete (Nondeterministic Polynomial Time Complete). Many approaches to the problem have been explored, including greedy algorithms, randomized search, genetic algorithms and A* search algorithm.

A very effective way to support aggregation and other common OLAP operations is the use of bitmap indexes.

(Types|Storage Mode)

OLAP systems have been traditionally categorized using the following taxonomy.

MOLAP

MOLAP of Multidimensionnal (OLAP|Database)

MOLAP is the 'classic' form of OLAP and is sometimes referred to as just OLAP. MOLAP uses database structures that are generally optimal for attributes such as time period, location, product or account code. The way that each dimension will be aggregated is defined in advance by one or more hierarchies.

ROLAP

ROLAP of Relational OLAP

ROLAP works directly with relational databases. The base data and the dimension tables are stored as relational tables and new tables are created to hold the aggregated information. Depends on a specialized schema design. A typical implementation : the cube is mapped to a set of base tables in a relational DBMS modeled as star or snowflake schema.

HOLAP

HOLAP of Hybrid OLAP

There is no clear agreement across the industry as to what constitutes “Hybrid OLAP”, except that a database will divide data between relational and specialized storage. For example, for some vendors, a HOLAP database will use relational tables to hold the larger quantities of detailed data, and use specialized storage for at least some aspects of the smaller quantities of more-aggregate or less-detailed data.

Comparison

OLAP Type Benefits Disagreement
MOLAP generally delivers better performance Some MOLAP implementations are prone to database explosion
ROLAP generally more scalable query performance can suffer
HOLAP mix the best of ROLAP and MOLAP (pre-process quickly, scale well and offer good function support)

MOLAP Database Explosion

Database explosion is a phenomenon causing vast amounts of storage space to be used by MOLAP databases when certain common conditions are met: high number of dimensions, pre-calculated results and sparse multidimensional data. The typical mitigation technique for database explosion is not to materialize all the possible aggregation, but only the optimal subset of aggregations based on the desired performance vs. storage trade off.

Molap Beter Performance

MOLAP generally delivers better performance due to specialized indexing and storage optimizations. MOLAP also needs less storage space compared to ROLAP because the specialized storage typically includes compression techniques.

Rolap more scalable

ROLAP is generally more scalable. However, large volume pre-processing is difficult to implement efficiently so it is frequently skipped. ROLAP query performance can therefore suffer. Since ROLAP relies more on the database to perform calculations, it has more limitations in the specialized functions it can use.

Requirements for the MOLAP component

Requirements for the MOLAP component of a data warehouse or data mart strategy include:

  • The ability to scale to large volumes of data and large numbers of concurrent users
  • Consistent, fast query response times that allow for iterative speed-of-thought analysis
  • Integrated metadata that seamlessly links the OLAP server and the data warehouse relational database
  • The ability to automatically drill from summary and calculated data, which is managed by the OLAP server, to detail data stored in the data warehouse relational database
  • A calculation engine that includes robust mathematical functions for computing derived data (aggregations, matrix calculations, cross-dimensional calculations, OLAP-aware formulas and procedural calculations)
  • Seamless integration of historical, projected and derived data
  • A multi-user read/write environment to support users what-if analysis, modeling and planning requirements
  • The ability to be deployed quickly, adopted easily and maintained cost-effectively
  • Robust data-access security and user management
  • Availability of a wide variety of viewing and analysis tools to support different user communities

Neither personal-productivity tools nor query and reporting tools are designed to meet these requirements by themselves. ROLAP (relational OLAP) tools meet only one or two of these requirements.

Documentation / Reference





Discover More
Data System Architecture
(Data|State|Operand) Management and Processing

This section is and state management as opposed to code. System that manages data are called database. In a computer, there is two kinds of byte instruction byte and data byte. This section is...
Analytic Applications
Business Intelligence Application

is also known as : OLAP Application Enterprise Performance Management applications Analytic applications You may find also Bi Data Quality Application such as : Some BI applications are common...
Data System Architecture
Cube - Multi-dimensional query (OLAP query)

An multi-dimensional query is a query that runs against a multi-dimensional structure (ie cube). It is also known as: an OLAP query. a business query multi-dimensional query are analytics query...
Data System Architecture
Cube - Olap Database

An OLAP database is an olap application that provide a multidimensional view of the data and are designed to answer analytical questions such as “Why?” and “How?” An OLAP database will contains...
Data System Architecture
Data Analysis - Exploratory Data Analysis (EDA|Data exploration|Discovery)

Data exploration aims to have first look of the data. This is an interactive process (iterative) that is based on a serie of ad-hoc queries. The inverse of exploratory is Explanatory boxuancui/DataExplorer/DataExplorer...
Data System Architecture
Data Warehouse

A data warehouse is a large central data repository of current, history and summarised data coming from operational and external sources used primarily for analysis. s is large historical databases for...
Oltp Dwh
Data Warehousing - Contrasting OLTP and Data Warehousing Environments

One major difference between the types of system is that data warehouses are not usually in third normal form (3NF), a type of data normalization common in Online Transaction Processing (OLTP) environments....
Star Schema
Dimensional Data Modeling - Fact Table

A fact table is a table that stores fact where a fact can be seen as the representation of an event. It's also known as: Transactional history Unalterable fact. (The content of this kind of table...
Card Puncher Data Processing
Multi-Dimensional Analysis Server (MDAS)

The registers itself with the (CMS), and its services are available for consumption by other servers or client applications that communicate via the BusinessObjects Enterprise framework. The MDAS provides...
Obiee Dense Analytics By Members
OBIEE - Densification / Dimensions Preservation Possibilities

You may want to view the data in dense form, with rows for all combination of dimension values displayed even when no fact data exist for them. A MOLAP environment give you automatically this feature...



Share this page:
Follow us:
Task Runner