OBIEE - Densification with the fact-based fragmentation capabilities

> Reporting and Data Access Tools > Oracle Business Intelligence (OBIEE/Siebel Analytics) > OBIEE - BI Server (OBIS|nqsserver|nqs)

1 - About

You may want to view the data in dense form, with rows for all combination of dimension values displayed even when no fact data exist for them.

The preservation of the dimensions is is also well known as a densification .

The densification of the data is performed in Sql in two steps :

More information can be found in this article : Analytic - Data Densification (sparse to dense data) - Preservation of the dimensions

This article show you how to model the repository in order to densify your data :

  1. how to model a cross join
Advertising

3 - Sparse data to densify

A typical situation in the sh schema with a sparse dimension is shown in the following example, which computes the weekly sales and year-to-date sales for the product Bounce for weeks 20-30 in 2000 and 2001:

SELECT SUBSTR(p.Prod_Name,1,15) Product_Name, 
       t.Calendar_Year          YEAR,
       t.Calendar_Week_Number   Week, 
       SUM(Amount_Sold)         Sales
FROM   
       Sales s, 
       Times t, 
       Products p
WHERE  
       s.Time_id = t.Time_id AND 
       s.Prod_id = p.Prod_id AND
       p.Prod_name IN ('Bounce') AND 
       t.Calendar_Year IN (2000,2001) AND
       t.Calendar_Week_Number BETWEEN 20 AND 30
GROUP BY 
       p.Prod_Name, 
       t.Calendar_Year, 
       t.Calendar_Week_Number
ORDER BY 
       product_name, 
       YEAR, 
       week ASC
PRODUCT_NAME          YEAR       WEEK      SALES
--------------- ---------- ---------- ----------
Bounce                2000         20        801
Bounce                2000         21    4062.24
Bounce                2000         22    2043.16
Bounce                2000         23    2731.14
Bounce                2000         24    4419.36
---- Miss one row for the week 25
---- Miss one row for the week 26
Bounce                2000         27    2297.29
Bounce                2000         28    1443.13
Bounce                2000         29    1927.38
Bounce                2000         30    1927.38
Bounce                2001         20     1483.3
Bounce                2001         21    4184.49
Bounce                2001         22    2609.19
Bounce                2001         23    1416.95
Bounce                2001         24    3149.62
Bounce                2001         25    2645.98
---- Miss one row for the week 26
Bounce                2001         27    2125.12
---- Miss one row for the week 28
Bounce                2001         29    2467.92
Bounce                2001         30    2620.17

In this example, we would expect 22 rows of data (11 weeks each from 2 years) if the data were dense. However we get only 18 rows because weeks 25 and 26 are missing in 2000, and weeks 26 and 28 in 2001.

4 - The Cross Join between the dimensions

In Obiee which have a fact centric modelling, you must use a fact table to link two dimensions. Thus, a solution is to create a dummy fact table with no valid condition and to link it on the other dimensions.

To create the dummy Fact Cross join :

  • in the physical layer, select a folder, right-click and choose “New Physical Table”
  • name it FactCrossJoin
  • select the table type : select and enter this statement :
SELECT 'Yes'  Preserve FROM dual
  • in the column tab, create the column “Preserver” as varchar(10) as below

Then we must join this fact table with all the other dimensions :

  • in the physical layer, select the FactCrossJoin table and the dimensions
  • right click, Physical Diagram / Selected Object Only
  • click on the Complex join icon

  • and create for each dimension a join with a condition 1 = 1.

Then for the Business Model :

  • drag and drop the column “Preserve” from the physical layer to the logical fact table.
  • rename it to “Preserve Dimension”
  • change the aggregation rule to “Max” (Min is also good). The result will return always Yes and remark that the table FactCrossJoin is added in the logical table source list.

Don't forget to drag and drop the logical column “Preserve Dimension” to the presentation layer.

In OBIEE Answers, now when you select only this measure column and that you choose some attributes of the dimension, Obiee will perform a cross-join.

In our case, we will choose :

  • Prod Name
  • Calendar Year
  • Calendar Week Number

and make this selection :

  • Prod Name = Bounce
  • Calendar Year in 2001 and 2002
  • Calendar Week Number between 20 and 30

The answer result :

Obiee fired the SQL below with no relation between the table and ask then a cross join to the database.

SELECT T210.PROD_NAME AS c1,
     T268.CALENDAR_YEAR AS c2,
     T268.CALENDAR_WEEK_NUMBER AS c3,
     MAX(T4454.Preserve) AS c4,
     T210.PROD_ID AS c5
FROM 
     SH.TIMES T268,
     SH.PRODUCTS T210,
     (SELECT 'Yes'  Preserve FROM dual) T4454
WHERE  ( T210.PROD_NAME = 'Bounce' AND (T268.CALENDAR_YEAR IN (2000, 2001)) AND T268.CALENDAR_WEEK_NUMBER BETWEEN 20 AND 30 ) 
GROUP BY T210.PROD_ID, T210.PROD_NAME, T268.CALENDAR_WEEK_NUMBER, T268.CALENDAR_YEAR
ORDER BY c1, c2, c3, c5
Advertising

5 - The outer join with the fact vertical partitioning capabilities

The fact vertical partitioning occurs when :

  • you have two logical table source with the same level of content for the same logical table
  • minimum one measure of each logical source fact table are included in the query

Then the BI Server generates two queries for each logical table source and perform a full outer join between them.

In our situation, we have already two fact table :

  • Sales
  • FactCrossJoin

And we need then just to add the amount measure of the sales tables in our previous report and a dense report appear.

6 - The Sql

The Sql here is splited in two parts and the Obi Server perform in its own memory the full outer join on the attributes of the dimension but you can push the complete query to the database: OBIEE - How to control the use of the WITH CLAUSE and of STITCH Join ?

SELECT MAX(T4454.Preserve) AS c1,
     T210.PROD_NAME AS c2,
     T268.CALENDAR_YEAR AS c3,
     T268.CALENDAR_WEEK_NUMBER AS c4,
     T210.PROD_ID AS c5
FROM 
     SH.TIMES T268,
     SH.PRODUCTS T210,
     (SELECT 'Yes'  Preserve FROM dual) T4454
WHERE  ( T210.PROD_NAME = 'Bounce' AND (T268.CALENDAR_YEAR IN (2000, 2001)) AND T268.CALENDAR_WEEK_NUMBER BETWEEN 20 AND 30 ) 
GROUP BY T210.PROD_ID, T210.PROD_NAME, T268.CALENDAR_WEEK_NUMBER, T268.CALENDAR_YEAR
ORDER BY c5, c3, c4
SELECT SUM(T245.AMOUNT_SOLD) AS c1,
     T210.PROD_NAME AS c2,
     T268.CALENDAR_YEAR AS c3,
     T268.CALENDAR_WEEK_NUMBER AS c4,
     T210.PROD_ID AS c5
FROM 
     SH.TIMES T268,
     SH.PRODUCTS T210,
     SH.SALES T245
WHERE  ( T210.PROD_ID = T245.PROD_ID AND T210.PROD_NAME = 'Bounce' AND T245.TIME_ID = T268.TIME_ID AND 
(T268.CALENDAR_YEAR IN (2000, 2001)) AND T268.CALENDAR_WEEK_NUMBER BETWEEN 20 AND 30 ) 
GROUP BY T210.PROD_ID, T210.PROD_NAME, T268.CALENDAR_WEEK_NUMBER, T268.CALENDAR_YEAR
ORDER BY c5, c3, c4
Advertising

7 - How to hide the preservation column ?

You have three possibilities :

  • Application of the hidden column format as the system-wide default for these preservation measure
  • or creation of a filter with the condition :
Preserve Dimension is equal to / is in Yes
  • Add this column as an implicit fact column of the presentation catalog. If you set an implicit fact column this column will be added to a query when it contains columns from two or more dimension tables and no measures. The column is not visible in the results. It is used to specify a default join path between dimension tables when there are several possible alternatives.