Hive - Text File (TEXTFILE)


1 - About

TEXTFILE is the default storage format of a Hive table.

Because it is the default (as long as the hive.default.fileformat property keeps its initial value, TextFile), the STORED AS TEXTFILE clause is optional.
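As a minimal sketch of this optional clause, the two statements below should produce tables with the same storage format, assuming hive.default.fileformat has not been changed on the cluster. The table names demo_explicit and demo_implicit are hypothetical and only serve the illustration.

```sql
-- Print the current default format; TextFile is the value Hive ships with.
SET hive.default.fileformat;

-- Hypothetical table: the storage format is stated explicitly.
CREATE TABLE demo_explicit (id BIGINT, name STRING)
STORED AS TEXTFILE;

-- Hypothetical table: no STORED AS clause, so the default format applies.
CREATE TABLE demo_implicit (id BIGINT, name STRING);

-- Compare the InputFormat / OutputFormat rows of both outputs.
DESCRIBE FORMATTED demo_explicit;
DESCRIBE FORMATTED demo_implicit;
```

If the two outputs differ, the default has most likely been overridden in the session or in the site configuration.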


3 - Default

3.1 - Delimiters

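By default, fields are delimited by ^A (Ctrl-A, octal \001). Collection items are delimited by ^B (\002), map keys by ^C (\003), and rows are terminated by a newline (\n).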
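To make the defaults visible, the sketch below spells them out in a ROW FORMAT clause. It is meant to be equivalent to omitting the clause altogether. The table demo_defaults and its columns are hypothetical.

```sql
-- Hypothetical table that writes out the default delimiters explicitly.
CREATE TABLE demo_defaults (
    id    BIGINT,
    tags  ARRAY<STRING>,
    attrs MAP<STRING, STRING>
)
ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\001'            -- ^A (Ctrl-A) between columns
    COLLECTION ITEMS TERMINATED BY '\002'  -- ^B between array items / map entries
    MAP KEYS TERMINATED BY '\003'          -- ^C between a map key and its value
    LINES TERMINATED BY '\n'               -- one row per line
STORED AS TEXTFILE;
```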

4 - Syntax

4.1 - STORED AS TEXTFILE

Example with the customer table of the TPC-DS schema:

CREATE EXTERNAL TABLE customer_row
(
    c_customer_sk             BIGINT,
    c_customer_id             STRING,
    c_current_cdemo_sk        BIGINT,
    c_current_hdemo_sk        BIGINT,
    c_current_addr_sk         BIGINT,
    c_first_shipto_date_sk    BIGINT,
    c_first_sales_date_sk     BIGINT,
    c_salutation              STRING,
    c_first_name              STRING,
    c_last_name               STRING,
    c_preferred_cust_flag     STRING,
    c_birth_day               INT,
    c_birth_month             INT,
    c_birth_year              INT,
    c_birth_country           STRING,
    c_login                   STRING,
    c_email_address           STRING,
    c_last_review_date        STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 'hdfs://locationToMyDirectory';

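The ROW FORMAT clause accepts the following options:

  • DELIMITED: declares that the rows are delimited text; it is followed by the terminator sub-clauses (FIELDS TERMINATED BY, LINES TERMINATED BY, ...)
  • ESCAPED BY: sets the escape character, so that a delimiter can appear inside a field value
  • NULL DEFINED AS: sets a custom NULL format (default is \N)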
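The options listed above can be combined in one ROW FORMAT clause. The following is a sketch only: the table customer_notes, its columns, and the choice of a backslash as the escape character and an empty string as the NULL marker are assumptions made for the illustration.

```sql
-- Hypothetical table combining DELIMITED, ESCAPED BY and NULL DEFINED AS.
CREATE TABLE customer_notes (
    c_customer_sk BIGINT,
    note          STRING
)
ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '|' ESCAPED BY '\\'  -- a literal | in a value is written as \|
    NULL DEFINED AS ''                        -- empty fields are read as NULL instead of \N
STORED AS TEXTFILE;
```

Note that ESCAPED BY belongs to the FIELDS TERMINATED BY sub-clause, while NULL DEFINED AS comes last in the ROW FORMAT clause.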

4.2 - STORED AS INPUTFORMAT/OUTPUTFORMAT

STORED AS TEXTFILE is shorthand for naming the input and output format classes explicitly:

STORED AS
INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
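
For context, the clause above can be placed in a complete statement as sketched below. The table customer_names_explicit and its two columns are hypothetical. The class names are the ones given above.

```sql
-- Hypothetical table that names the text input/output format classes explicitly.
CREATE TABLE customer_names_explicit (
    c_customer_sk BIGINT,
    c_first_name  STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS
    INPUTFORMAT  'org.apache.hadoop.mapred.TextInputFormat'
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';

-- Show the stored definition, including the format classes, to check the result.
SHOW CREATE TABLE customer_names_explicit;
```

Running SHOW CREATE TABLE on a table created with STORED AS TEXTFILE should list the same two classes.
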
db/hive/text_file.txt · Last modified: 2019/05/28 10:39 by gerardnico