Starting Apache Spark™ SQL Thrift Server with Kerberos
Spark SQL Thrift Server is a long-running service and must be configured to start with a keytab file if Kerberos is enabled.
The user principal must be added to DSE, and Spark SQL Thrift Server must be restarted with the generated BYOS configuration file and byos-version.jar.
Prerequisites
These instructions are for the Spark SQL Thrift Server included in Hortonworks Data Platform (HDP) 2.4.
The Hadoop Spark SQL Thrift Server principal is hive/_HOST@REALM.
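Hadoop services replace the _HOST token in a principal with the fully qualified host name of the machine they run on, which is why the user created in the procedure names the Thrift Server host explicitly. As a rough illustration only, the hypothetical helper below performs that substitution in shell. The function name, the lowercasing step, and the use of hostname -f are assumptions about how the host name is resolved in your environment, so compare the result with the principal stored in the keytab before relying on it.

```shell
#!/usr/bin/env bash
# Sketch only: expand_principal is a hypothetical helper, not part of DSE or HDP.

# Replace the _HOST token in a Kerberos principal with a host name.
# usage: expand_principal <principal> [host]
# If no host is given, the local fully qualified name from `hostname -f` is used (assumption).
expand_principal() {
  local principal="$1"
  local host="${2:-$(hostname -f)}"
  # Lowercase the host name so the result has a consistent form.
  host=$(printf '%s' "$host" | tr '[:upper:]' '[:lower:]')
  printf '%s\n' "$principal" | sed "s/_HOST/$host/"
}

# Example (the host name is a placeholder):
# expand_principal 'hive/_HOST@REALM' spark_sql_thrift_server_host
```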
Procedure
- Create the principal on the DSE node using cqlsh.
  cqlsh> create user 'hive/spark_sql_thrift_server_host@REALM';
- Log in as the hive user on the Spark SQL Thrift Server host.
- Create a ~/.java.login.config file with a JAAS Kerberos configuration.
- Merge the existing Spark SQL Thrift Server configuration properties with the generated BYOS configuration file into a new file.
  cat /usr/hdp/current/spark-thriftserver/conf/spark-thrift-sparkconf.conf byos.properties > custom-sparkconf.conf
- Start Spark SQL Thrift Server with the custom configuration file and byos-version.jar.
  /usr/hdp/2.4.2.0-258/spark/sbin/start-thriftserver.sh --jars byos-version.jar --properties-file custom-sparkconf.conf
- Connect using the Beeline client.
  beeline -u 'jdbc:hive2://hostname:port/default;principal=hive/_HOST@REALM'
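The JAAS file and merge steps in the procedure can be scripted. The following is a minimal sketch under stated assumptions, not DataStax-supplied tooling: the function names are hypothetical, the DseClient entry name and the Krb5LoginModule options reflect a typical keytab-based JAAS configuration, and the keytab path and principal shown in the example comments are placeholders to replace with the values for your cluster.

```shell
#!/usr/bin/env bash
# Sketch only: function names, keytab path, and principal are illustrative placeholders.

# Write a keytab-based JAAS Kerberos configuration to the given file.
# usage: write_jaas_config <file> <keytab> <principal>
write_jaas_config() {
  local file="$1" keytab="$2" principal="$3"
  cat > "$file" <<EOF
DseClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="$keytab"
  principal="$principal";
};
EOF
  # Restrict access, since the file points at credentials.
  chmod 600 "$file"
}

# Concatenate the existing Thrift Server properties and the BYOS properties
# into a new file, refusing to continue if either input is missing.
# usage: merge_spark_conf <thrift_conf> <byos_conf> <output>
merge_spark_conf() {
  local thrift_conf="$1" byos_conf="$2" out="$3"
  local f
  for f in "$thrift_conf" "$byos_conf"; do
    if [ ! -r "$f" ]; then
      echo "missing or unreadable: $f" >&2
      return 1
    fi
  done
  # The blank line between the files keeps the last property of the first file
  # from running into the first property of the second when the trailing
  # newline is missing.
  { cat "$thrift_conf"; printf '\n'; cat "$byos_conf"; } > "$out"
}

# Example (keytab path and principal are placeholders):
# write_jaas_config ~/.java.login.config /path/to/hive.keytab hive/spark_sql_thrift_server_host@REALM
# merge_spark_conf /usr/hdp/current/spark-thriftserver/conf/spark-thrift-sparkconf.conf byos.properties custom-sparkconf.conf
```

Checking that both input files exist before merging avoids starting the Thrift Server with a configuration file that silently lacks the BYOS settings.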
Next steps
Generated SQL schema files can be passed to beeline with the -f option to generate a mapping for DSE tables, so that both Hadoop and DataStax Enterprise tables are available through the service for queries.
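When there are several generated schema files, they can be applied in a loop. The sketch below is one possible way to do that, assuming the files are plain SQL scripts. The apply_schema_files function and the BEELINE override variable are hypothetical conveniences, and the JDBC URL and file names in the example comment are placeholders. Only the -u and -f options of beeline are used.

```shell
#!/usr/bin/env bash
# Sketch only: apply_schema_files and the BEELINE override are hypothetical helpers.

# Command used to invoke Beeline; override it to point at a specific installation.
BEELINE="${BEELINE:-beeline}"

# Run each SQL schema file through Beeline, stopping at the first failure.
# usage: apply_schema_files <jdbc_url> <file.sql>...
apply_schema_files() {
  local url="$1"
  shift
  local f
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "schema file not readable: $f" >&2
      return 1
    fi
    "$BEELINE" -u "$url" -f "$f" || return 1
  done
}

# Example (hostname, port, and file names are placeholders):
# apply_schema_files 'jdbc:hive2://hostname:port/default;principal=hive/_HOST@REALM' keyspace1.sql keyspace2.sql
```

Stopping at the first failure keeps a partially applied set of mappings easy to spot, because the last file named in the output is the one that needs attention.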