Oracle Database - Character Set Functions : CONVERT, UNISTR

1 - About

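Oracle Database provides two character set functions: CONVERT, which converts a string from one character set to another, and UNISTR, which returns a string in Unicode.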

3 - Functions

3.1 - CONVERT

CONVERT(char, dest_char_set [, source_char_set]) converts a character string from one character set to another. The datatype of the returned value is VARCHAR2.

  • The char argument is the value to be converted. It can be any of the datatypes CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB.
  • The dest_char_set argument is the name of the character set to which char is converted.
  • The source_char_set argument is the name of the character set in which char is stored in the database. The default value is the database character set.

Both the destination and source character set arguments can be either literals or columns containing the name of the character set.
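As a rough sketch of the argument forms described above: the first query omits source_char_set, so the database character set is assumed, and the second reads the source character set name from a column. The table legacy_text and its columns txt and src_cs are hypothetical names used only for illustration.

```sql
-- Source character set omitted: the database character set is assumed.
SELECT CONVERT('A B C D E', 'US7ASCII')
  FROM DUAL;

-- Hypothetical table: legacy_text(txt VARCHAR2(100), src_cs VARCHAR2(30)),
-- where src_cs holds a character set name such as 'WE8ISO8859P1'.
SELECT CONVERT(txt, 'US7ASCII', src_cs)
  FROM legacy_text;
```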

For complete correspondence in character conversion, it is essential that the destination character set contains a representation of all the characters defined in the source character set. Where a character does not exist in the destination character set, a replacement character appears. Replacement characters can be defined as part of a character set definition.

The following example illustrates character set conversion by converting a Latin-1 string to ASCII. The result is the same as importing the same string from a WE8ISO8859P1 database to a US7ASCII database.

SELECT CONVERT('Ä Ê Í Õ Ø A B C D E ', 'US7ASCII', 'WE8ISO8859P1') 
   FROM DUAL; 
 
CONVERT('ÄÊÍÕØABCDE' 
--------------------- 
A E I ? ? A B C D E ? 
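In the output above, ? is the replacement character for characters that have no US7ASCII representation. One possible way to spot rows that would be altered by such a conversion is to compare each value with its converted form. This is only a sketch: the table legacy_text and its column txt are hypothetical, and the comparison assumes the data is stored in the database character set.

```sql
-- Hypothetical table: legacy_text(txt VARCHAR2(100)).
-- Lists rows whose text changes when converted to US7ASCII,
-- i.e. rows containing characters that would be substituted or replaced.
SELECT txt,
       CONVERT(txt, 'US7ASCII') AS converted_txt
  FROM legacy_text
 WHERE CONVERT(txt, 'US7ASCII') <> txt;
```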

Common character sets include:

  • US7ASCII: US 7-bit ASCII character set
  • WE8DEC: West European 8-bit character set
  • WE8HP: HP West European Laserjet 8-bit character set
  • F7DEC: DEC French 7-bit character set
  • WE8EBCDIC500: IBM West European EBCDIC Code Page 500
  • WE8PC850: IBM PC Code Page 850
  • WE8ISO8859P1: ISO 8859-1 West European 8-bit character set
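To see which character set a given database uses, and which names are accepted, the data dictionary can be queried. The sketch below assumes the session can read NLS_DATABASE_PARAMETERS and the V$NLS_VALID_VALUES view; access to V$ views may require additional privileges.

```sql
-- Database character set (the default source_char_set for CONVERT).
SELECT value
  FROM NLS_DATABASE_PARAMETERS
 WHERE parameter = 'NLS_CHARACTERSET';

-- Character set names known to the instance.
SELECT value
  FROM V$NLS_VALID_VALUES
 WHERE parameter = 'CHARACTERSET'
 ORDER BY value;
```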

3.2 - UNISTR

UNISTR takes as its argument a string in any character set and returns it in Unicode, in the database Unicode character set. To include a UCS2 code point in the string, write the escape backslash (\) followed by the code point's four-digit hexadecimal value, for example \00D6. To include the backslash itself, precede it with another backslash (\\).

This function is similar to the TRANSLATE … USING function, except that UNISTR offers the escape character for UCS2 codepoints and backslash characters.

The following example returns the Unicode character for the escaped code point 00D6:

SELECT UNISTR('\00D6') FROM DUAL;
 
UN
--
Ö
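Escaped code points can also be mixed with ordinary characters, and a doubled backslash stands for a single literal backslash. The queries below are a minimal sketch of both cases; how the results display depends on the client character set.

```sql
-- Plain text followed by three escaped code points (00E5, 00F1, 00F6),
-- which stand for the characters å, ñ and ö.
SELECT UNISTR('abc\00e5\00f1\00f6') FROM DUAL;

-- The doubled backslash produces one literal backslash in the result.
SELECT UNISTR('C:\\temp') FROM DUAL;
```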