Oracle Database - ORA_HASH Function

Card Puncher Data Processing

Purpose

ORA_HASH is a hash function that computes a hash value for a given expression. This function is useful for operations such as analysing a subset of data and generating a random sample. The function returns a NUMBER value.

Syntax

ora_hash::=ORA_HASH(expr,max_bucket,seed_value)

where:

  • ora_hash is a NUMBER value.
  • The expr argument determines the data for which you want Oracle Database to compute a hash value. There are no restrictions on the type or length of data represented by expr, which commonly resolves to a column name.

The expr cannot be a LONG or LOB type. It cannot be a user-defined object type unless it is a nested table type.

  • The argument max_bucket is optional and it determines the maximum bucket value returned by the hash function. You can specify any value between 0 and 4294967295. The default is 4294967295.
  • The optional seed_value argument enables Oracle to produce many different results for the same set of data. Oracle applies the hash function to the combination of expr and seed_value. You can specify any value between 0 and 4294967295. The default is 0.

Joining Tables Via A Hash

CREATE TABLE t1 AS
SELECT owner, table_name, tablespace_name
FROM all_tables;

CREATE TABLE t2 AS
SELECT owner, table_name, tablespace_name
FROM all_indexes;

ALTER TABLE t1
ADD (hashcol NUMBER(38));

ALTER TABLE t2
ADD (hashcol NUMBER(38));

UPDATE t1
SET hashcol = ORA_HASH(owner || table_name ||  tablespace_name);

UPDATE t2
SET hashcol = ORA_HASH(owner || table_name ||  tablespace_name);

CREATE INDEX ix_t1_columns
ON t1 (owner, table_name, tablespace_name);

CREATE INDEX ix_t2_columns
ON t2 (owner, table_name, tablespace_name);

CREATE INDEX ix_t1_hash ON t1 (hashcol);

CREATE INDEX ix_t2_hash ON t2 (hashcol);

set linesize 121

SELECT * FROM t1
WHERE rownum < 101;

SELECT * FROM t2
WHERE rownum < 101;

--=========================================
set serveroutput on

DECLARE
 CURSOR rcur IS
 SELECT t1.table_name, t2.tablespace_name
 FROM t1, t2
 WHERE t1.table_name = t2.table_name
 AND t1.tablespace_name = t2.tablespace_name;

 CURSOR hcur IS
 SELECT t1.table_name, t2.tablespace_name
 FROM t1, t2
 WHERE t1.hashcol = t2.hashcol;

 n PLS_INTEGER;
BEGIN
  n := dbms_utility.get_time;
  FOR i IN 1..1000
  LOOP
    FOR rec IN rcur
    LOOP
      NULL;
    END LOOP;
  END LOOP;

  n := dbms_utility.get_time - n;
  dbms_output.put_line('w/o Hashing: ' || n);

  n := dbms_utility.get_time;
  FOR i IN 1..1000
  LOOP
    FOR rec IN hcur
    LOOP
      NULL;
    END LOOP;
  END LOOP;

  n := dbms_utility.get_time - n;
  dbms_output.put_line('w/ Hashing: ' || n);
END;
/

Result :

w/o Hashing: 1117
w/ Hashing: 1095

PL/SQL procedure successfully completed.


Random Sampling Data Via A Hash

The following example retrieves a subset of the data in the sh.sales table by specifying 10 buckets (0 to 9) and then returning the data from bucket 1. The expected subset is about 10% of the rows (the sales table has 960 rows):

SELECT * FROM sales WHERE ORA_HASH(cust_id, 9) = 1;

Reference





Discover More
Consistent Hashing
Cryptography - Hash

A hash function is an encryption crypto algorithm that takes as data as input (possibly large and of variable-sized) and produces a short fixed-length integer value (generally printed as an hexadecimal...
Card Puncher Data Processing
OWB - How to implement a type 2 slowly changing dimension with a hash function ?

Slowly changing dimension is the ability to track change on a record from a data set. How can we implement the type 2 to track the change and implement an history view. The type 2 : add a new record...



Share this page:
Follow us:
Task Runner