Oracle Database - Histogram Analytic

Card Puncher Data Processing

About

How to build an histogram in Oracle (and binning)

Function

equiwidth

You create equi-width (all bin have the same distance) histogram with the apps/search/search.jsp function.

WIDTH_BUCKET(column, min, max, bucket Number)

where:

  • column is a date or number column
  • From the min, the max and the bucket number the boundary of the buckets can be calculated.

Date Equi-width histogram Example where:

  • the 'dataset' with clause represents your sample
  • 7 is the number of day for your bucket
  • 20 is the number of bucket

The below query gives you a histogram result that you can plot in Excel for instance.

WITH
  dataset AS
  (
    SELECT
      myColumnName AS Analyzed_Column -- Date or numeric
    FROM
      myTable
    WHERE
      myDateFilter > sysdate - 7*20 -- If needed
  )
  ,
  MINMAX AS
  (
    SELECT
      MIN(Analyzed_Column) AS min_value,
      MAX(Analyzed_Column) AS Max_Value
    FROM
      dataset
  )
  ,
  buckets AS
  (
    SELECT
      Analyzed_Column,
      WIDTH_BUCKET(Analyzed_Column, MINMAX.min_value, MINMAX.max_value, 20) AS
      bucket
    FROM
      dataset,
      MINMAX
  )
  ,
  buckets_full AS
  (
    SELECT
      rownum AS bucket
    FROM
      (
        SELECT
          1 AS bucket_number
        FROM
          dual
          CONNECT BY LEVEL <= (20 + 1)
      )
  )
  ,
  buckets_analysis AS
  (
    SELECT
      bucket,
      MAX(Analyzed_Column) AS max_value,
      MIN(Analyzed_Column) AS min_value,
      COUNT(1)         AS COUNT
    FROM
      buckets
    GROUP BY
      bucket
  )
SELECT
  buckets_full.bucket,
  buckets_analysis.max_value,
  buckets_analysis.min_value,
  buckets_analysis.count
FROM
  buckets_full
LEFT OUTER JOIN buckets_analysis
ON
  buckets_full.bucket = buckets_analysis.bucket
order by buckets_full.bucket;

equiheight

apps/search/search.jsp: equiheight histograms. Each bin get the same amount of records. The bin boundary varies.

See Quantile - Percentile

Example of output for a sample of 40 rows

TILE ROW COUNT
1 10
2 10
3 10
4 10





Discover More
Histogram Height Balanced Uniform Distribution
Oracle Database - Statistics - Histogram (Column statistics)

Data Dictionary Column statistics may be stored as histograms. These histograms provide accurate estimates of the distribution of column data. Histograms provide improved selectivity estimates in the presence...



Share this page:
Follow us:
Task Runner