Starting Spark SQL Thrift Server with Kerberos

Starting Spark SQL Thrift Server with Kerberos and BYOS.

Spark SQL Thrift Server is a long-running service, so when Kerberos is enabled it must be configured to start with a keytab file. The service's user principal must be added to DSE, and Spark SQL Thrift Server must be restarted with the generated BYOS configuration file and byos-version.jar.

Prerequisites

These instructions are for the Spark SQL Thrift Server included in Hortonworks Data Platform (HDP) 2.4. The Hadoop Spark SQL Thrift Server principal is hive/_HOST@REALM.

Procedure

  1. Create the principal on the DSE node using cqlsh.
    create user 'hive/spark_sql_thrift_server_host@REALM';
  2. Log in as the hive user on the Spark SQL Thrift Server host.
  3. Create a ~/.java.login.config file with a JAAS Kerberos configuration.
  4. Merge the existing Spark SQL Thrift Server configuration properties with the generated BYOS configuration file into a new file.
    cat /usr/hdp/current/spark-thriftserver/conf/spark-thrift-sparkconf.conf byos.properties > custom-sparkconf.conf
  5. Start Spark SQL Thrift Server with the custom configuration file and byos-version.jar.
    /usr/hdp/2.4.2.0-258/spark/sbin/start-thriftserver.sh --jars byos-version.jar --properties-file custom-sparkconf.conf
  6. Connect using the Beeline client.
    beeline -u 'jdbc:hive2://hostname:port/default;principal=hive/_HOST@REALM'
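
Step 3 does not show what the JAAS file contains. The following is a minimal sketch of one way to write it, not a definitive configuration. It assumes that DseClient is the JAAS entry name the DSE client libraries look up, and that the hive service keytab is at /etc/security/keytabs/hive.service.keytab; both are assumptions, as are the host and realm placeholders. Note that it overwrites any existing file at the target path.

```shell
#!/bin/sh
# Assumed values: replace with your own keytab path, host, and realm.
KEYTAB="/etc/security/keytabs/hive.service.keytab"
PRINCIPAL="hive/spark_sql_thrift_server_host@REALM"

# Step 3 names ~/.java.login.config; set JAAS_FILE to write somewhere else.
JAAS_FILE="${JAAS_FILE:-$HOME/.java.login.config}"

# DseClient is assumed to be the JAAS entry name read by the DSE client.
# Krb5LoginModule logs in from the keytab, so no password prompt is needed.
cat > "$JAAS_FILE" <<EOF
DseClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="$KEYTAB"
  principal="$PRINCIPAL";
};
EOF

# Restrict the file to the hive user.
chmod 600 "$JAAS_FILE"
```

The principal in this file should match the user created in step 1.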

What's next

Pass generated SQL schema files to beeline with the -f option to create mappings for DSE tables. Both Hadoop and DataStax Enterprise tables are then available for queries through the service.
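
As a rough sketch of that step, the command below reuses the connection URL from the procedure and adds the -f option. The file name dse-schema.sql is a hypothetical placeholder for a generated schema file, and hostname, port, and REALM remain placeholders to substitute. The script only runs beeline when it is installed and the file exists; otherwise it prints the command it would run.

```shell
#!/bin/sh
# Placeholders: substitute the Thrift Server host, port, and Kerberos realm.
THRIFT_HOST="hostname"
THRIFT_PORT="port"
REALM="REALM"
# Hypothetical name for a generated SQL schema file.
SCHEMA_FILE="dse-schema.sql"

# Same JDBC URL form as in the Beeline connection step.
JDBC_URL="jdbc:hive2://${THRIFT_HOST}:${THRIFT_PORT}/default;principal=hive/_HOST@${REALM}"

# Run beeline only if it is on the PATH and the schema file exists;
# otherwise print the command so it can be reviewed first.
if command -v beeline >/dev/null 2>&1 && [ -f "$SCHEMA_FILE" ]; then
  beeline -u "$JDBC_URL" -f "$SCHEMA_FILE"
else
  echo "beeline -u '$JDBC_URL' -f $SCHEMA_FILE"
fi
```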