Database - What means "Data is dense/sparse" ?
Definition
Sparse
Data is normally stored in sparse form. If no value exists for a given combination of dimension values, no row exists in the fact table. For example, if not every product is sold in every market. In this case, Market and Product are sparse dimensions.
It's why in the reporting tool Obiee for instance, by default, data are considered sparse.
Dense
Most multidimensional databases may also contain dense dimensions. A fact table is considered to have dense data if it has (of a high probability to have) one row for every combination of its associated dimension levels.
Articles Related
Example of sparse data
A typical situation in the SH schema with a sparse dimension is shown in the following example, which computes the weekly sales and year-to-date sales for the product Bounce for weeks 20-30 in 2000 and 2001:
SELECT SUBSTR(p.Prod_Name,1,15) Product_Name, t.Calendar_Year Year, t.Calendar_Week_Number Week, SUM(Amount_Sold) Sales FROM Sales s, Times t, Products p WHERE s.Time_id = t.Time_id AND s.Prod_id = p.Prod_id AND p.Prod_name IN ('Bounce') AND t.Calendar_Year IN (2000,2001) AND t.Calendar_Week_Number BETWEEN 20 AND 30 GROUP BY p.Prod_Name, t.Calendar_Year, t.Calendar_Week_Number ORDER BY product_name, year, week ASC
PRODUCT_NAME YEAR WEEK SALES --------------- ---------- ---------- ---------- Bounce 2000 20 801 Bounce 2000 21 4062.24 Bounce 2000 22 2043.16 Bounce 2000 23 2731.14 Bounce 2000 24 4419.36 ---- Miss one row for the week 25 ---- Miss one row for the week 26 Bounce 2000 27 2297.29 Bounce 2000 28 1443.13 Bounce 2000 29 1927.38 Bounce 2000 30 1927.38 Bounce 2001 20 1483.3 Bounce 2001 21 4184.49 Bounce 2001 22 2609.19 Bounce 2001 23 1416.95 Bounce 2001 24 3149.62 Bounce 2001 25 2645.98 ---- Miss one row for the week 26 Bounce 2001 27 2125.12 ---- Miss one row for the week 28 Bounce 2001 29 2467.92 Bounce 2001 30 2620.17
In this example, we would expect 22 rows of data (11 weeks each from 2 years) if the data were dense. However we get only 18 rows because weeks 25 and 26 are missing in 2000, and weeks 26 and 28 in 2001.
Data Densification (sparse to dense data)
You may want to view the data in dense form, with rows for all combination of dimension values displayed even when no fact data exist for them.
For example, if a product did not sell during a particular time period, you may still want to see the product for that time period with zero sales value next to it. Continue to read ...