Connecting to DataStax Enterprise using the Spark shell on an external Spark cluster
Use the Spark shell on an external Spark cluster to connect to DataStax Enterprise.
Use the generated byos.properties configuration file and the byos-version.jar from a DataStax Enterprise node to connect to the DataStax Enterprise cluster from the Spark shell on an external Spark cluster.
clients directory
The default location of the clients directory depends on the type of installation:
- Package installations: /usr/share/dse/clients
- Tarball installations: installation_location/clients
Prerequisites
You must generate the byos.properties file on a node in your DataStax Enterprise cluster before starting this procedure.
Procedure
- Copy the byos.properties file you previously generated from the DataStax Enterprise node to the local Spark node:
scp user@dsenode1.example.com:~/byos.properties .
If you are using Kerberos authentication, specify the --generate-token and --token-renewer <username> options when generating byos.properties.
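As a sketch, generating the file with the Kerberos token options on a DataStax Enterprise node might look like the following; verify the byos-export subcommand against your DSE version, and note that spark_user is a placeholder for the account allowed to renew the token:

```shell
# Run on a DataStax Enterprise node, not on the external Spark node.
# --generate-token embeds an encoded delegation token in the output file;
# --token-renewer names the user permitted to renew that token.
dse client-tool configuration byos-export \
    --generate-token \
    --token-renewer spark_user \
    byos.properties
```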
- Copy the byos-version.jar file from the clients directory on a node in your DataStax Enterprise cluster to the local Spark node. The byos-version.jar file location depends on the type of installation:
scp user@dsenode1.example.com:/usr/share/dse/clients/dse-byos_2.11-6.0.2.jar byos-6.0.jar
- Merge the external Spark cluster's properties into byos.properties:
cat ${SPARK_HOME}/conf/spark-defaults.conf >> byos.properties
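Note that a plain append can leave the same key defined twice; java.util.Properties resolves duplicates last-one-wins, so the external cluster's values would override the DSE settings on conflict. If you want the DSE settings to win instead, a first-occurrence merge is one option. The file contents below are illustrative stand-ins for byos.properties and ${SPARK_HOME}/conf/spark-defaults.conf:

```shell
# Illustrative sample files (contents are examples only):
printf 'spark.cassandra.connection.host 10.10.1.5\nspark.serializer org.apache.spark.serializer.KryoSerializer\n' > byos.properties
printf 'spark.serializer org.apache.spark.serializer.JavaSerializer\nspark.eventLog.enabled true\n' > spark-defaults.conf

# Keep the first occurrence of each key, so byos.properties wins on
# conflicts; keys unique to either file are preserved.
awk '!seen[$1]++' byos.properties spark-defaults.conf > merged.properties
mv merged.properties byos.properties
```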
- Optional: If you are using Kerberos authentication, set up a cron job or other task scheduler to periodically run dse client-tool cassandra renew-token <token>, where <token> is the encoded token string in byos.properties.
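For example, a crontab entry along these lines would renew the token once a day; the schedule is illustrative, and the token placeholder must be replaced with the actual encoded string:

```shell
# Renew the BYOS delegation token every day at 02:00 (example schedule).
# Replace <token> with the encoded token string from byos.properties.
0 2 * * * dse client-tool cassandra renew-token <token>
```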
- Start the Spark shell using the byos.properties file and the byos-version.jar:
spark-shell --jars byos-6.0.jar --properties-file byos.properties
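To confirm that the shell picked up the BYOS configuration, you can run a quick non-interactive check. The property name follows the spark-cassandra-connector convention and is an assumption to verify against your generated byos.properties:

```shell
# Print the Cassandra host the shell will connect to, then exit.
# (Assumes spark.cassandra.connection.host is set in byos.properties.)
echo 'println(spark.conf.get("spark.cassandra.connection.host"))' | \
  spark-shell --jars byos-6.0.jar --properties-file byos.properties
```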