Hive - URI (Connection URL)

> Database > Apache - Hive (HS|Hive Server)

1 - About

The JDBC URL in a Hive Context to make a Hive JDBC connection.

See Ref doc

Advertising

3 - Syntax

3.1 - Driver and prefix

The JDBC connection URL format has:

  • the prefix jdbc:hive2://
  • and the Driver class is org.apache.hive.jdbc.HiveDriver

3.2 - Zookeeper

A Zookeeper connection (when Service Discovery Is Enabled) process is:

  • Every HiveServer2 instances registers itself with ZooKeeper
  • The client driver connects to the ZooKeeper ensemble and asks for a <host>:<port> (znode)
  • ZooKeeper randomly returns <host>:<port> for one of the registered HiveServer2 instances to the JDBC driver
  • The JDBC driver does a connection to this <host>:<port> with the options defined.
jdbc:hive2://<zookeeper quorum>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=<hiveserver2_namespace>?option
# Example 
jdbc:hive2://<zookeeper quorum>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2?tez.queue.name=hive1&hive.server2.thrift.resultset.serialize.in.tasks=true

where:

  • <zookeeper quorum> is the same as the value of hive.zookeeper.quorum configuration parameter in hive-site.xml/hivserver2-site.xml
  • and hive conf as query parameters
Advertising

3.3 - Direct connection

A client,

Remote HiveServer2 mode is recommended for production use, as it is more secure and doesn't require direct HDFS/metastore access to be granted for users.

3.3.1 - Remote

In remote mode, connection to a separate HiveServer2 process over Thrift.

If you don't have a “/” after the port number, the jdbc driver does not parse the hostname and ends up running HS2 in embedded mode . So if you are specifying a hostname, make sure you have a “/” or “/<dbname>” after the port number.

  • Without SSL
# Default
jdbc:hive2://<host1>:<port1|10000>,<host2>:<port2>/dbName;initFile=<file>;sess_var_list?hive_conf_list#hive_var_list
 
# in http transport mode
jdbc:hive2://<host>:<port|10001>/<db>;transportMode=http;httpPath=<http_endpoint>
  • Ssl
# Binary Mode
 jdbc:hive2://<host>:<port>/<db>;ssl=true;sslTrustStore=<trust_store_path>;trustStorePassword=<trust_store_password>
 
# Http Mode
jdbc:hive2://<host>:<port>/<db>;ssl=true;sslTrustStore=<trust_store_path>;trustStorePassword=<trust_store_password>;transportMode=http;httpPath=<http_endpoint>

Where

  • <host> is name or IP address of the machine on which HiveServer2 runs.
  • <port> is the port number on which HiveServer2 listens
    • Default to 10000 in binary mode
    • Default to 10001 in http transport mode. See conf hive.server2.thrift.port when hive.server2.transport.mode is http.
  • <db> is the database name to which you want to connect. (Default = default)
  • <host1>:<port1>,<host2>:<port2> is a server instance or a comma separated list of server instances to connect to (if dynamic service discovery is enabled). If empty, the embedded server will be used.
  • dbName is the name of the initial database.
  • file is the path of init script file (Hive 2.2.0 and later). This script file is written with SQL statements which will be executed automatically after connection. This option can be empty.
  • sess_var_list is a semicolon separated list of key=value pairs of session variables (e.g., user=foo;password=bar).
  • hive_conf_list is a semicolon separated list of key=value pairs of Hive configuration variables for this session
  • hive_var_list is a semicolon separated list of key=value pairs of Hive variables for this session.
  • <trust_store_path> is the path where client's truststore file lives.
  • <trust_store_password> is the password to access the truststore.
  • <http_endpoint> is the corresponding HTTP endpoint. Default value is cliservice or /. See conf hive.server2.thrift.http.path
  • transportMode : Hive - Transport Mode. Default port for HTTP transport mode is 10001. See conf hive.server2.thrift.http.port
Advertising

3.3.2 - Embedded

In the embedded mode, the client runs an embedded Hive (similar to Hive CLI)

jdbc:hive2:///;initFile=<file> (no host or port).

If the Hadoop cluster uses SSL or TLS authentication, you must add ssl=true to the JDBC connection URI.

jdbc:hive2://<hostname>:<port>/<db>;ssl=true

4 - Example

4.1 - Azure

4.1.1 - Zookeeper Quorum

  • Zookeeper Quorum
jdbc:hive2://zk0-hi-clu.3qy32mhqlj1ubaea5iyw5joamf.ax.internal.cloudapp.net:2181,zk1-hi-clu.3qy32mhqlj1ubaea5iyw5joamf.ax.internal.cloudapp.net:2181,zk6-hi-clu.3qy32mhqlj1ubaea5iyw5joamf.ax.internal.cloudapp.net:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2

4.1.2 - Headnode

  • Headnode (Example Azure with default database)
jdbc:hive2://headnode:10001/default/;transportMode=http;httpPath=/hive2

5 - Support

5.1 - Invalid status 72 (state=08S01,code=0)

beeline> !connect jdbc:hive2://localhost:10001/default; user pwd
Connecting to jdbc:hive2://localhost:10001/default;
18/05/28 10:38:09 [main]: WARN jdbc.HiveConnection: Failed to connect to localhost:10001
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10001/default;: Invalid status 72 (state=08S01,code=0)

Error 72 is a bad url issue

Try to add the transport mode, it 's mandatory if you are in http transport mode

jdbc:hive2://headnode:10001/default/;transportMode=http;httpPath=/hive2

6 - Documentation / Reference

db/hive/uri.txt · Last modified: 2019/06/04 15:07 by gerardnico