Oracle Database - PL/SQL - Using utl_match functions to compare string similarity

1 - About

The UTL_MATCH package provides four functions that compare a source string with a destination string. Each uses a different method and returns a measure of what it would take to turn the source string into the destination string, or of how closely the two strings match.

3 - Functions

3.1 - EDIT DISTANCE

Returns the number of changes required to turn the source string into the destination string using the Levenshtein Distance algorithm.

utl_match.edit_distance(s1 IN VARCHAR2, s2 IN VARCHAR2) RETURN PLS_INTEGER;
SELECT utl_match.edit_distance('street', 'str') DIST
FROM dual;
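
As a rough illustration of how the count behaves, the query below runs the function over a few literal pairs. The pairs are made up for this example. Working the Levenshtein algorithm by hand, 'street' to 'str' needs three deletions, and 'kitten' to 'sitting' needs two substitutions plus one insertion, so a distance of 3 would be expected for both.

```sql
-- Illustrative pairs only; they do not come from a real table.
WITH pairs AS (
  SELECT 'street' AS s1, 'str' AS s2 FROM dual UNION ALL
  SELECT 'kitten', 'sitting' FROM dual UNION ALL
  SELECT 'street', 'street' FROM dual
)
SELECT s1,
       s2,
       -- number of single-character inserts, deletes and substitutions
       utl_match.edit_distance(s1, s2) AS dist
FROM pairs;
```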

3.2 - EDIT DISTANCE SIMILARITY

Returns a score based on the edit distance, expressed as an integer between 0 and 100, where 0 indicates no similarity at all and 100 indicates a perfect match.

utl_match.edit_distance_similarity(s1 IN VARCHAR2, s2 IN VARCHAR2) RETURN PLS_INTEGER;
SELECT utl_match.edit_distance_similarity('street', 'str') SIM
FROM dual;
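
The sketch below puts the similarity next to a hand-made normalization of the raw edit distance, so the two can be compared. The formula in the second column is an assumption (distance divided by the length of the longer string, rounded to a whole percentage), not something stated in the package source, so treat any difference between the two columns as a sign that the assumption does not hold.

```sql
SELECT utl_match.edit_distance_similarity('street', 'str') AS sim,
       -- assumed normalization: 100 * (1 - distance / longer length)
       ROUND(100 * (1 - utl_match.edit_distance('street', 'str')
                        / GREATEST(LENGTH('street'), LENGTH('str')))) AS sim_by_hand
FROM dual;
```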

3.3 - JARO WINKLER

Instead of simply calculating the number of steps required to change the source string to the destination string, determines how closely the two strings agree with each other and tries to take into account the possibility of a data entry error. Returns a score between 0 (no match) and 1 (perfect match).

utl_match.jaro_winkler(s1 IN VARCHAR2, s2 IN VARCHAR2) RETURN BINARY_DOUBLE;
SELECT utl_match.jaro_winkler('street', 'str') SCORE
FROM dual;
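
To see how this differs from a plain step count, the following sketch shows the edit distance and the Jaro-Winkler score side by side for a few made-up pairs: a transposition typo, a shared prefix, and two unrelated words. No output is shown because the exact scores are best checked on your own instance.

```sql
-- Illustrative pairs only; they do not come from a real table.
WITH pairs AS (
  SELECT 'shackleford' AS s1, 'shackelford' AS s2 FROM dual UNION ALL
  SELECT 'street', 'str' FROM dual UNION ALL
  SELECT 'street', 'avenue' FROM dual
)
SELECT s1,
       s2,
       utl_match.edit_distance(s1, s2)          AS dist,     -- step count
       ROUND(utl_match.jaro_winkler(s1, s2), 3) AS jw_score  -- agreement score
FROM pairs
ORDER BY jw_score DESC;
```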

3.4 - JARO WINKLER SIMILARITY

Returns the Jaro-Winkler score as an integer between 0 and 100, where 0 indicates no similarity at all and 100 indicates a perfect match. Like JARO WINKLER, it tries to take into account possible data entry errors.

utl_match.jaro_winkler_similarity(s1 IN VARCHAR2, s2 IN VARCHAR2) RETURN PLS_INTEGER;
SELECT utl_match.jaro_winkler_similarity('street', 'str') SIM
FROM dual;
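
One common use of an integer score like this is a fuzzy filter. The sketch below is a minimal example under stated assumptions: the names list is inline sample data, and the threshold of 85 is an arbitrary value chosen for illustration that would need tuning against real data.

```sql
-- Hypothetical sample data standing in for a real name column.
WITH names AS (
  SELECT 'Smith' AS last_name FROM dual UNION ALL
  SELECT 'Smyth' FROM dual UNION ALL
  SELECT 'Smithe' FROM dual UNION ALL
  SELECT 'Jones' FROM dual
)
SELECT last_name,
       utl_match.jaro_winkler_similarity(last_name, 'Smith') AS sim
FROM names
-- 85 is an assumed cut-off, not a recommended value
WHERE utl_match.jaro_winkler_similarity(last_name, 'Smith') >= 85
ORDER BY sim DESC;
```

On a large table a filter like this calls the function once per row, so it is usually worth narrowing the candidate rows with an ordinary predicate first.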

4 - Reference

  • Source : $ORACLE_HOME/rdbms/admin/utlmatch.sql
db/oracle/utl_match.txt · Last modified: 2017/09/06 19:29 by gerardnico