OBIEE - Basic Discrete Distribution Report

1 - Purpose

This dashboard page describes basic statistical discrete distribution views of a selected population. It lets the user dynamically define the number of buckets to use for statistical distribution, as well as the grain in the population, and provides several dynamic representations in the results.

This report is useful to understand how the individuals of a population are distributed between the minimum and the maximum values, and to suggest the probabilities of where an individual may fall in a specific bucket. The report applies to numerous business situations:

  • distribution of order values,
  • distribution of call times,
  • distribution of salaries,
  • and so on.

It allows visualizing skewness of a given population versus typical distribution.

3 - Answer

3.1 - Bin

  • Basic formula structure:
FLOOR [ (Metric Value - MIN(Metric)) / (MAX(Metric)*1.0001-MIN(Metric)) * {NumBands} ] + 1 
  • Presentation Service formula structure
FLOOR( ((@{Met01}{Measures.default} ) - MIN(@{Met01}{Measures.Default)) / 
((MAX(@{Met01}{Measures.Default})*1.0001 - MIN(@{Met01}{Measures.Default})) * @{NumBands}{10})) + 1
  • Answers Aggregation rule: None
  • Description: This column calculates which bucket each individual falls in by its metric value. The number of desired buckets is specified with {NumBands} Presentation Variable. The calculation of the bucket for a given individual is the distance of this individual to the bottom value of the population, divided by the range of this population, multiplied by number of total buckets desired. The Floor expression rounds this value to an integer. The 1.0001 multiplier factor guarantees that the main ratio will never be equal to 1, to avoid having one extra bucket for single top individual.

3.2 - Bin Bounds

  • Basic formula structure:
[ Bin Number for Current Record * Range of the Metric / Total number of Bins ) + Value of Bottom Individual ]
   Concatenated with ' - ' and 
   [ (Bin Number for Current Record + 1) * Range of the Metric / Total number of Bins ) + Value of Bottom Individual ] 
  • Presentation Service formula:
CAST(CAST((
/* Bin Number for Current Record */
FLOOR(((@{Met01}{Measures.Default}) - MIN(@{Met01}{Measures.Default})) / 
( (MAX(@{Met01}{Measures.Default})*1.0001 - MIN(@{Met01}{Measures.Default}))  
/@{NumBands}{10}    ) )     
/* Range of Metrics */
* ((MAX(CAST(@{Met01}{Measures.Default} AS DOUBLE))*1.0001 - MIN(CAST(@{Met01}{Measures.Default} AS DOUBLE)))
/* Total Number of Bins */
/@{NumBands}{10}) 
/* Value of Bottom Individual */
+ MIN(CAST(@{Met01}{Measures.Default} AS DOUBLE)))/@{Denominator}{1} 
AS INT) AS VARCHAR(50)) 
/* Concatenated with */
||' - '||  
CAST(CAST((
/* Bin Number for Current Record + 1*/
FLOOR(((CAST(@{Met01}{Measures.Default} AS DOUBLE)) - MIN(CAST(@{Met01}{Measures.Default} AS DOUBLE)))    / 
( (MAX(@{Met01}{Measures.Default})*1.0001 - MIN(@{Met01}{Measures.Default}))  
/@{NumBands}{10}  ) +1)
/* Range of Metrics */
* ((MAX(CAST(@{Met01}{Measures.Default} AS DOUBLE))*1.0001 - MIN(CAST(@{Met01}{Measures.Default} AS DOUBLE)))
/* Total Number of Bins */
/@{NumBands}{10}) 
/* Value of Bottom Individual */
+ MIN(CAST(@{Met01}{Measures.Default} AS DOUBLE)))/@{Denominator}{1} 
AS INT) AS VARCHAR(50))
  • Answers Aggregation rule: Default
  • Description: This column converts the bin number in expression of the bound values for this bin. Ie, this bin starts at value X and ends at value Y. The interval X-Y should represent 1/ total number of bins * Range of metric value..

3.3 - Avg Value

  • Basic formula structure:
Metric value divided by denominator Presentation Variable 
  • Answers Aggregation rule: Average
  • Description: This column aggregates the metric values to show the average value

3.4 - Confidence

  • Basic formula structure:
From ((Quintile of metric )-1 ) * 20% To (Quintile of metric ) * 20% 
  • Presentation Service formula structure:
'From' || CAST ((ntile(@{Met01}{Measures.Default},5)-1)*20 AS VARCHAR (2))||' % to '|| CAST ((ntile(@{Met01}
{Measures.Default},5))*20 AS VARCHAR(3)) ||'% Interval'
  • Answers Aggregation rule: Default
  • Description: This indicates the probability that any random individual will fall below a given value in the range. There is 20% probability that a random record falls within the first quintile, ie, that is value be below the cap (max) value of first quintile. Similarly, there is 40% probability for any random individual to fall within first two quintiles, etc…

3.5 - Cum Value

  • Basic formula structure:
Rolling Sum (Metric) 
  • Presentation Service formula structure:
rsum(@{Met01}{Measures.Default}/@{Denominator}{1})
  • Answers Aggregation rule: Default
  • Description: This column cumulates a rolling sum of the metric value. It's used in the value distribution chart to display the cumulative value line.

3.6 - Individual Counts

  • Basic formula structure :
1 (constant value)
  • Answers Aggregation rule: Sum
  • Description: Counting the records in each bucket in the query

3.7 - Cum Count

  • Basic formula structure:
Rolling sum (1)
  • Answers Aggregation rule: Default
  • Description: Cumulates a rolling counter of the records. This column is used in the value distribution chart to display the cumulative value line.

3.8 - Pct Value

  • Basic formula structure:
Metric value / Sum(Metric Value) * 100
  • Answers Aggregation rule: Sum
  • Description: expresses the metric value as a percentage of the total value in the query population.

3.9 - Cum Pct Value

  • Basic formula structure:
Rolling sum [ Metric value / Sum(Metric Value) * 100 ]
  • Answers Aggregation rule: None
  • Description: rolling sum that cumulates metric value expressed as a percentage of the total value in the query population.

3.10 - Includes Avg

  • Basic formula structure:
Case when (distance between metric value for current record and Avg (metric)) is minimum, 
then mark all individuals in the bin with 0 else null.
  • Answer formula:
MAX(
CASE WHEN rank(-abs(avg(@{Met01}{Measures.Default} BY 1)-@{Met01}{Measures.Default})) = 1 THEN 0 END 
BY 
/* BIN */
(FLOOR(((CAST(@{Met01}{Measures.Default} AS DOUBLE)) - MIN(@{Met01}{Measures.Default})) / 
((MAX(@{Met01}{Measures.Default})*1.0001 - MIN(@{Met01}{Measures.Default}))/@{NumBands}{10})) + 1)
)
  • Answers Aggregation rule: Max
  • Description: This marker locates the bin where the closest individual to Average population value is located. This marker is then used in the table and chart views.

3.11 - Median

  • Basic formula structure:
Median (Metric)
  • Answer formula:
median(@{Met01}{Measure.Default})/@{Denominator}{1}
  • Answers Aggregation rule: Default
  • Description: Indicates the median value of the metric for the population.

3.12 - Skewness (Pearson's)

  • Basic formula structure:
3 * [ mean (metric) - median(metric) ]/ standard deviation(metric)
  • Answer Formula:
3 * ( AVG(@{Met01}{Measures.Default})- MEDIAN(@{Met01}{Measures.Default}) ) / stddev(@{Met01}{Measures.Default})
  • Answers Aggregation rule: Default
  • Description: Pearsons coefficient of skewness, orientation of the median relative to the quartiles. This coefficient measures the asymmetry of the distribution. When coefficient is positive (positive skew) the median is less than the mean and in that case the tail of the distribution is skewed to the right. When the median is more than the mean, the coefficient is negative and the tail of the distribution is skewed in the left direction. If there is no skewness or the distribution is symmetric like the bell-shaped normal curve then the mean equals the median which equals the mode, and the coefficient is close to zero.

3.13 - Range

  • Basic formula structure:
Max (Metric) - Min (Metric)
  • Answer formula:
(MAX (@{Met01}{Measures.default})-MIN (@{Met01}{Measures.default}))/@{Denominator}{1}
  • Answers Aggregation rule: Default
  • Description: Displays the range of the metric value for the whole population.

3.14 - Coeff of Variation

  • Basic formula structure:
Standard Deviation (Metric) / Mean (Metric)
  • Answer Formula:
stddev(@{Met01}{Measures.Default})/avg(@{Met01}{Measures.Default})*100
  • Answers Aggregation rule: Default
  • Description: coefficient of variation is a statistical measure of the dispersion of data points in a data series around the mean. it is a useful indicator for comparing the degree of variation from one data series to another, even if the means are drastically different from each other. Coefficient of variation can help determine how much volatility (risk) in the population.: The lower the ratio of standard deviation to mean return, the lower the risk and the better the risk-return tradeoff. Distributions with CV below 1 are considered low-variance, while those with CV > 1 are considered high-variance

3.15 - Bin Amplitude

  • Basic formula structure:
[Max (Metric) - Min (Metric)] / Number of Bins in the whole population
  • Answer Forumla:
(( MAX (@{Met01}{Measures.Default})-Min (@{Met01}{Measures.Measures.Default} )) / @{NumBands}{10})/@{Denominator}{1}
  • Answers Aggregation rule: Default
  • Description: This column divides the range by the number of bins desired in the distribution analysis, and indicates the value span of a single bin.

4 - Documentation / Reference

dat/obiee/obips/distribution.txt · Last modified: 2017/11/16 22:57 by gerardnico