Accessing the Spark SQL Thrift Server with the Simba JDBC driver

The Simba JDBC driver allows you to access the Spark SQL Thrift Server.

The Simba JDBC Driver for Spark provides a standard JDBC interface to the data stored in DataStax Enterprise when the Spark SQL Thrift Server is running.

Your DSE license includes a license to use the Simba drivers.

Prerequisites

You must have a running DSE Analytics cluster with Spark enabled, and one node in the cluster running the Spark SQL Thrift Server.

Procedure

  1. Download the Simba JDBC Driver for Apache Spark from the DataStax Drivers Download page.
  2. Expand the ZIP file containing the driver.
  3. In your JDBC application, configure the following details:
    1. Add SparkJDBC41.jar and the rest of the JAR files included in the ZIP file to your classpath.
    2. The JDBC driver class is com.simba.spark.jdbc41.Driver and the JDBC data source is com.simba.spark.jdbc41.DataSource.
    3. Set the connection URL to jdbc:spark://<hostname>:<port>, where <hostname> is the hostname of the node running the Spark SQL Thrift Server and <port> is the port on which the server is listening. For example:
      jdbc:spark://node1.example.com:10000
  4. For more details, refer to the documentation included in the Simba driver download ZIP file.
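
The configuration in step 3 can be sketched in Java roughly as follows. This is a minimal illustration rather than a definitive client: the class name SparkThriftExample and the helpers buildUrl and listTables are hypothetical, the host and port are the example values from above, the server is assumed to accept connections without authentication, and SHOW TABLES is only a sample query. The driver JAR files must be on the classpath for the connection to succeed.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

public class SparkThriftExample {
    // Driver class name from step 3 of the procedure.
    static final String DRIVER_CLASS = "com.simba.spark.jdbc41.Driver";

    // Builds a connection URL of the form jdbc:spark://<hostname>:<port>.
    static String buildUrl(String hostname, int port) {
        return "jdbc:spark://" + hostname + ":" + port;
    }

    // Hypothetical helper: connects, runs a sample query, and prints each row.
    static void listTables(String url) throws ClassNotFoundException, SQLException {
        // Explicit loading is usually optional with JDBC 4 drivers, but harmless.
        Class.forName(DRIVER_CLASS);
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            ResultSetMetaData meta = rs.getMetaData();
            int columns = meta.getColumnCount();
            while (rs.next()) {
                // Print every column, since the result layout may vary by version.
                StringBuilder row = new StringBuilder();
                for (int i = 1; i <= columns; i++) {
                    if (i > 1) {
                        row.append('\t');
                    }
                    row.append(rs.getString(i));
                }
                System.out.println(row);
            }
        }
    }

    public static void main(String[] args) {
        // Assumed example host and port; replace with your Thrift Server node.
        String url = buildUrl("node1.example.com", 10000);
        try {
            listTables(url);
        } catch (ClassNotFoundException e) {
            System.err.println("Simba driver not found on the classpath: " + e.getMessage());
        } catch (SQLException e) {
            System.err.println("Could not query " + url + ": " + e.getMessage());
        }
    }
}
```

If the driver JAR files are missing from the classpath, this sketch reports that the driver class was not found instead of connecting, which makes it a quick check that step 3.1 was completed.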