Oracle Database - ORA_HASH Function
Purpose
ORA_HASH is a hash function that computes a hash value for a given expression. This function is useful for operations such as analysing a subset of data and generating a random sample. The function returns a NUMBER value.
Articles Related
Syntax
ora_hash::=ORA_HASH(expr,max_bucket,seed_value)
where:
- ora_hash is a NUMBER value.
- The expr argument determines the data for which you want Oracle Database to compute a hash value. There are no restrictions on the type or length of data represented by expr, which commonly resolves to a column name.
The expr cannot be a LONG or LOB type. It cannot be a user-defined object type unless it is a nested table type.
- The argument max_bucket is optional and it determines the maximum bucket value returned by the hash function. You can specify any value between 0 and 4294967295. The default is 4294967295.
- The optional seed_value argument enables Oracle to produce many different results for the same set of data. Oracle applies the hash function to the combination of expr and seed_value. You can specify any value between 0 and 4294967295. The default is 0.
Joining Tables Via A Hash
CREATE TABLE t1 AS SELECT owner, table_name, tablespace_name FROM all_tables; CREATE TABLE t2 AS SELECT owner, table_name, tablespace_name FROM all_indexes; ALTER TABLE t1 ADD (hashcol NUMBER(38)); ALTER TABLE t2 ADD (hashcol NUMBER(38)); UPDATE t1 SET hashcol = ORA_HASH(owner || table_name || tablespace_name); UPDATE t2 SET hashcol = ORA_HASH(owner || table_name || tablespace_name); CREATE INDEX ix_t1_columns ON t1 (owner, table_name, tablespace_name); CREATE INDEX ix_t2_columns ON t2 (owner, table_name, tablespace_name); CREATE INDEX ix_t1_hash ON t1 (hashcol); CREATE INDEX ix_t2_hash ON t2 (hashcol); SET linesize 121 SELECT * FROM t1 WHERE ROWNUM < 101; SELECT * FROM t2 WHERE ROWNUM < 101; --========================================= SET serveroutput ON DECLARE CURSOR rcur IS SELECT t1.table_name, t2.tablespace_name FROM t1, t2 WHERE t1.table_name = t2.table_name AND t1.tablespace_name = t2.tablespace_name; CURSOR hcur IS SELECT t1.table_name, t2.tablespace_name FROM t1, t2 WHERE t1.hashcol = t2.hashcol; n PLS_INTEGER; BEGIN n := DBMS_UTILITY.get_time; FOR i IN 1..1000 LOOP FOR rec IN rcur LOOP NULL; END LOOP; END LOOP; n := DBMS_UTILITY.get_time - n; DBMS_OUTPUT.put_line('w/o Hashing: ' || n); n := DBMS_UTILITY.get_time; FOR i IN 1..1000 LOOP FOR rec IN hcur LOOP NULL; END LOOP; END LOOP; n := DBMS_UTILITY.get_time - n; DBMS_OUTPUT.put_line('w/ Hashing: ' || n); END; /
Result :
w/o Hashing: 1117 w/ Hashing: 1095 PL/SQL procedure successfully completed.
Random Sampling Data Via A Hash
The following example retrieves a subset of the data in the sh.sales table by specifying 10 buckets (0 to 9) and then returning the data from bucket 1. The expected subset is about 10% of the rows (the sales table has 960 rows):
SELECT * FROM sales WHERE ORA_HASH(cust_id, 9) = 1;