Spark SQL - Server (Thrift) (STS)
1 - About
The Spark SQL server corresponds to HiveServer2 in Hive 1.2.1. It is a Thrift JDBC/ODBC server.
2 - Articles Related
3 - Version
- beeline from Spark or Hive 1.2.1
- Hive 1.2.1
4 - Configuration
4.1 - High availability
Service discovery is not yet available (SPARK-19541).
Therefore, a load balancer must be placed in front of two Thrift servers.
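A load balancer usually needs a health check to decide which Thrift server receives traffic. The following is a minimal sketch of such a probe, not a definitive implementation. It assumes bash (for the /dev/tcp pseudo-device) and the coreutils timeout command; the function names and the host names in the usage comment are hypothetical. An open port only shows that the listener is up, not that Spark can run queries.

```bash
#!/usr/bin/env bash

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Assumption: bash with /dev/tcp support and coreutils `timeout` are available.
thrift_alive() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Print the first reachable "host:port" among the arguments.
# Returns 1 if none of them accepts a connection.
first_alive() {
  local target
  for target in "$@"; do
    if thrift_alive "${target%:*}" "${target##*:}"; then
      printf '%s\n' "$target"
      return 0
    fi
  done
  return 1
}

# Usage (hypothetical hosts):
#   first_alive sts1.example.com:10000 sts2.example.com:10000
```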
5 - Management
5.1 - Start
5.1.1 - Linux
To start the JDBC/ODBC server, run the following in the Spark directory:
./sbin/start-thriftserver.sh
# From Hortonworks
./sbin/start-thriftserver.sh --master yarn-client --executor-memory 512m --hiveconf hive.server2.thrift.port=10015
5.1.2 - Windows
cd %SPARK_HOME%\bin
spark-class2 org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal
5.2 - Connection
5.2.1 - Port
The port can be configured with the following conf parameter: --hiveconf hive.server2.thrift.port=10001
The startup output also reports the port (default: 10000):
18/07/18 16:36:05 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
18/07/18 16:36:05 INFO ObjectStore: Initialized ObjectStore
18/07/18 16:36:05 INFO HiveMetaStore: 0: get_databases: default
18/07/18 16:36:05 INFO audit: ugi=gerard ip=unknown-ip-addr cmd=get_databases: default
18/07/18 16:36:05 INFO HiveMetaStore: 0: Shutting down the object store...
18/07/18 16:36:05 INFO audit: ugi=gerard ip=unknown-ip-addr cmd=Shutting down the object store...
18/07/18 16:36:05 INFO HiveMetaStore: 0: Metastore shutdown complete.
18/07/18 16:36:05 INFO audit: ugi=gerard ip=unknown-ip-addr cmd=Metastore shutdown complete.
18/07/18 16:36:05 INFO AbstractService: Service:ThriftBinaryCLIService is started.
18/07/18 16:36:05 INFO AbstractService: Service:HiveServer2 is started.
18/07/18 16:36:05 INFO HiveThriftServer2: HiveThriftServer2 started
18/07/18 16:36:05 INFO ThriftCLIService: Starting ThriftBinaryCLIService on port 10000 with 5...500 worker threads
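When the server is started from a script, the port can be read back from this output instead of by eye. The sketch below assumes the log still contains a "Starting ThriftBinaryCLIService on port N" entry as shown above; the function name and the log file name in the usage comment are hypothetical.

```bash
#!/usr/bin/env bash

# Read a startup log on stdin and print the Thrift port it reports.
# Assumption: the log contains an entry of the form
#   "... Starting ThriftBinaryCLIService on port <N> with ..."
# If the entry appears several times, the last one wins.
# Prints nothing if no such entry is found.
thrift_port_from_log() {
  sed -n 's/.*Starting ThriftBinaryCLIService on port \([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Usage (hypothetical log file):
#   thrift_port_from_log < thriftserver.log
```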
5.2.2 - Driver UI
http://172.23.0.1:4040/jobs/ (default)
On HDInsight, the driver UI is reached through the YARN UI.
5.2.3 - Headnode
The service for connecting to Spark SQL (Thrift/JDBC) is a Spark Thrift server running on the head nodes (example on Azure: port 10002, protocol Thrift).
5.2.4 - Azure HdInsight
The connection string is the same as for Hive, except that httpPath=/hive2 becomes httpPath=/sparkhive2.
- Gateway:
jdbc:hive2://clustername.azurehdinsight.net:443/;ssl=true;transportMode=http;httpPath=/sparkhive2
- HeadNode:
jdbc:hive2://headnodehost:10002/;transportMode=http
Example with beeline
beeline -u 'jdbc:hive2://headnodehost:10002/;transportMode=http'
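The two URL shapes above differ only in host, port and options, so they can be generated from a cluster name rather than typed by hand. This is a small sketch under that assumption; the function names are hypothetical, and the URL templates are copied from the Gateway and HeadNode entries above.

```bash
#!/usr/bin/env bash

# Gateway URL: public endpoint over HTTPS (port 443), Spark path /sparkhive2.
# $1 is the HDInsight cluster name.
hdinsight_gateway_url() {
  printf 'jdbc:hive2://%s.azurehdinsight.net:443/;ssl=true;transportMode=http;httpPath=/sparkhive2\n' "$1"
}

# Head-node URL: direct connection on port 10002.
# $1 is the head node host; defaults to "headnodehost" as in the example above.
hdinsight_headnode_url() {
  printf 'jdbc:hive2://%s:10002/;transportMode=http\n' "${1:-headnodehost}"
}

# Usage:
#   beeline -u "$(hdinsight_headnode_url)"
```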
5.2.5 - Beeline
beeline
!connect jdbc:hive2://localhost:10000 nico ""
SET;
SHOW TABLES;
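The same session can be run without the interactive prompt, which is convenient in scripts. The sketch below uses beeline's -u (URL), -n (user name) and -e (query) options. It omits the password on the assumption that the server is not secured, as with the empty password above. The function name and the BEELINE override variable are hypothetical; the variable exists only so that the binary can be swapped out.

```bash
#!/usr/bin/env bash

# Run SQL statements through beeline non-interactively.
# $1: JDBC URL, $2: user name, $3: SQL text (statements separated by ';').
# Assumption: no password is required (non-secured Thrift server).
# BEELINE may point to another beeline binary; it defaults to "beeline" on PATH.
run_sql() {
  local url="$1" user="$2" sql="$3"
  "${BEELINE:-beeline}" -u "$url" -n "$user" -e "$sql"
}

# Usage:
#   run_sql 'jdbc:hive2://localhost:10000' nico 'SHOW TABLES;'
```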