dse spark-submit
Launches applications on a cluster, using Spark's supported cluster managers through a uniform interface. This command supports the same options as Apache Spark spark-submit.
Restriction: Command is supported only on nodes with analytics workloads.
Synopsis
dse spark-submit
  --class <class_name> <jar_file> <other_options> |
  --status|--kill <driver_id> [--master <master_ip_address>]
Syntax conventions | Description |
---|---|
UPPERCASE | Literal keyword. |
Lowercase | Not literal. |
`<italics>` | Variable value. Replace with a valid option or user-defined value. |
`[ ]` | Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets. |
`( )` | Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses. |
`\|` | Or. A vertical bar ( \| ) separates alternative elements. Type any one of the elements. Do not type the vertical bar. |
`...` | Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required. |
`'Literal string'` | Single quotation ( ' ) marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case. |
`{ key:value }` | Map collection. Braces ( { } ) enclose map collections or key-value pairs. A map is a JSON-style array of elements. |
`<datatype1,datatype2>` | Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma. |
`cql_statement;` | End CQL statement. A semicolon ( ; ) terminates all CQL statements. |
`[--]` | Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options. |
`' <schema> ... </schema> '` | Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration. |
`@xml_entity='xml_entity_type'` | Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files. |
Unlike the standard behavior for the Spark status and kill options, in DSE deployments these options do not require the Spark Master IP address.
- --kill <driver_id>: Kill a Spark application running in the DSE cluster.
- --master <master_ip_address>: The IP address of the Spark Master running in the DSE cluster.
- --status <driver_id>: Get the status of a Spark application running in the DSE cluster.
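Although DSE does not require the Spark Master IP address for these options, it can still be supplied explicitly with --master. A sketch (the driver ID is illustrative; replace <master_ip_address> with the master's address):
dse spark-submit --status driver-20180726160353-0019 --master <master_ip_address>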
Examples
Run the HTTP response example program (located in the dse-demos directory) on two nodes:
dse spark-submit --class com.datastax.HttpSparkStream target/HttpSparkStream.jar -d 2
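Because dse spark-submit accepts the same options as Apache Spark spark-submit, standard resource flags pass straight through. A sketch with illustrative values for executor memory and cores:
dse spark-submit --class com.datastax.HttpSparkStream --executor-memory 2G --total-executor-cores 4 target/HttpSparkStream.jar -d 2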
To submit an application in cluster mode, using the supervise option to restart the driver in case of failure
dse spark-submit --deploy-mode cluster --supervise --class com.datastax.HttpSparkStream target/HttpSparkStream.jar -d $NUM_SPARK_NODES
To submit an application in cluster mode when TLS is enabled
Pass the SSL configuration with standard Spark options to use secure HTTPS on port 4440.
dse spark-submit \
--conf spark.ssl.ui.enabled=true \
--conf spark.ssl.ui.keyPassword=<keystore_password> \
--conf spark.ssl.ui.keyStore=<path_to_keystore> \
myApplication.jar
To set the driver host to a publicly accessible IP address
dse spark-submit --conf spark.driver.host=203.0.113.0 myApplication.jar
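If the driver must also listen on a fixed, reachable port, for example when a firewall restricts inbound connections, the standard Spark spark.driver.port setting can be passed the same way (the port value here is illustrative):
dse spark-submit --conf spark.driver.host=203.0.113.0 --conf spark.driver.port=37000 myApplication.jar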
To get the status of a driver
Unlike the Apache Spark option, you do not have to specify the Spark Master IP address.
dse spark-submit --status driver-20180726160353-0019
Result when the driver exists:
Driver driver-20180726160353-0019 found: state=<state>, worker=<workerId> (<workerHostPort>)
To kill a driver
Unlike the Apache Spark option, you do not have to specify the Spark Master IP address.
dse spark-submit --kill driver-20180726160353-0019