dse spark-submit
Launches applications on a cluster and can use all of Spark's supported cluster managers through a uniform interface. This command supports the same options as Apache Spark™ spark-submit.
Restriction: This command is supported only on nodes running analytics workloads.
Synopsis
$ dse spark-submit --class class_name jar_file other_options [--master master_ip_address]
Syntax conventions
Syntax conventions | Description
--- | ---
UPPERCASE | Literal keyword.
Lowercase | Not literal.
*Italics* | Variable value. Replace with a valid option or user-defined value.
[ ] | Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets.
( ) | Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.
\| | Or. A vertical bar ( \| ) separates alternative elements. Type any one of the elements. Do not type the vertical bar.
... | Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required.
'Literal string' | Single quotation ( ' ) marks must surround literal strings in CQL statements. Use two single quotation marks to escape a single quotation mark.
{ key:value } | Map collection. Braces ( { } ) enclose map collections or key-value pairs. A colon separates the key and the value.
<datatype1,datatype2> | Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma.
cql_statement; | End CQL statement. A semicolon ( ; ) terminates all CQL statements.
[--] | Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.
' <schema> ... </schema> ' | Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration.
@xml_entity='xml_entity_type' | Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files.
Options
This command accepts all Apache Spark spark-submit options. Unlike the standard behavior of the Spark --status and --kill options, in DSE deployments these options do not require the Spark Master IP address.
--master master_ip_address
The IP address of the Spark Master running in the DSE cluster.
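For example, a hypothetical invocation that targets a specific Spark Master; 203.0.113.10 is a placeholder address, and the class and JAR are taken from the examples below:
$ dse spark-submit --master 203.0.113.10 --class com.datastax.HttpSparkStream target/HttpSparkStream.jar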
Examples
To submit an application whose class defines an option named d
$ dse spark-submit --class com.datastax.HttpSparkStream target/HttpSparkStream.jar -d $NUM_SPARK_NODES
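For reference, a minimal sketch of what such a class might look like. The class name matches the example above, but the body and argument handling are illustrative assumptions, not the actual HttpSparkStream source:
package com.datastax

object HttpSparkStream {
  def main(args: Array[String]): Unit = {
    // Read the value that follows "-d" (the number of Spark nodes in the example above).
    // Illustrative only: the real class's argument parsing is not shown in this document.
    val numNodes = args.sliding(2).collectFirst {
      case Array("-d", value) => value.toInt
    }.getOrElse(1)
    println(s"Running with $numNodes Spark node(s)")
    // ... create the SparkContext and application logic here ...
  }
}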
To submit an application in cluster mode with the supervise option, which restarts the driver in case of failure
$ dse spark-submit --deploy-mode cluster --supervise --class com.datastax.HttpSparkStream target/HttpSparkStream.jar -d $NUM_SPARK_NODES
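In cluster mode, the submission output includes the generated driver ID (of the form driver-20180726160353-0019, as in the status and kill examples below); use that ID with the --status and --kill options.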
To submit an application in cluster mode when TLS is enabled
Pass the SSL configuration using standard Spark properties to enable secure HTTPS on port 4440.
$ dse spark-submit \
--conf spark.ssl.ui.enabled=true \
--conf spark.ssl.ui.keyPassword=keystore_password \
--conf spark.ssl.ui.keyStore=path_to_keystore \
myApplication.jar
To set the driver host to a publicly accessible IP address
$ dse spark-submit --conf spark.driver.host=203.0.113.0 myApplication.jar
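Setting spark.driver.host this way is useful when executors must connect back to a driver running outside the cluster network, for example when submitting from a host with multiple network interfaces.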
To get the status of a driver
Unlike the Apache Spark option, you do not have to specify the Spark Master IP address.
$ dse spark-submit --status driver-20180726160353-0019
Result when the driver exists:
Driver driver-20180726160353-0019 found: state=<state>, worker=<workerId> (<workerHostPort>)
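The state field reports one of Spark's standard driver states, such as SUBMITTED, RUNNING, FINISHED, FAILED, KILLED, or ERROR.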
To kill a driver
As with the status option, you do not have to specify the Spark Master IP address.
$ dse spark-submit --kill driver-20180726160353-0019