dse spark-submit
Launches applications on a cluster to enable use of Spark cluster managers through a uniform interface. This command supports the same options as Apache Spark spark-submit.
Synopsis
dse spark-submit --class class_name jar_file other_options | --status driver_id | --kill driver_id [--master master_ip_address]
| Syntax conventions | Description |
|---|---|
| `UPPERCASE` | Literal keyword. |
| `Lowercase` | Not literal. |
| *Italics* | Variable value. Replace with a valid option or user-defined value. |
| `[ ]` | Optional. Square brackets ( `[ ]` ) surround optional command arguments. Do not type the square brackets. |
| `( )` | Group. Parentheses ( `( )` ) identify a group to choose from. Do not type the parentheses. |
| `\|` | Or. A vertical bar ( `\|` ) separates alternative elements. Type any one of the elements. Do not type the vertical bar. |
| `...` | Repeatable. An ellipsis ( `...` ) indicates that you can repeat the syntax element as often as required. |
| `'Literal string'` | Single quotation ( `'` ) marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case. |
| `{ key:value }` | Map collection. Braces ( `{ }` ) enclose map collections or key-value pairs. A colon separates the key and the value. |
| `<datatype1,datatype2>` | Set, list, map, or tuple. Angle brackets ( `< >` ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma. |
| `cql_statement;` | End CQL statement. A semicolon ( `;` ) terminates all CQL statements. |
| `[ -- ]` | Separate the command line options from the command arguments with two hyphens ( `--` ). This syntax is useful when arguments might be mistaken for command line options. |
| `'<schema> ... </schema>'` | Search CQL only: Single quotation marks ( `'` ) surround an entire XML schema declaration. |
| `@xml_entity='xml_entity_type'` | Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files. |
Unlike the standard behavior of the Spark status and kill options, in DSE deployments these options do not require the Spark Master IP address.
- `--kill driver_id`
- Kill a Spark application running in the DSE cluster.
- `--master master_ip_address`
- The IP address of the Spark Master running in the DSE cluster.
- `--status driver_id`
- Get the status of a Spark application running in the DSE cluster.
Examples
To submit an application
dse spark-submit --class com.datastax.HttpSparkStream target/HttpSparkStream.jar -d 2
To submit an application in cluster mode, with the supervise option so the driver is restarted if it fails
dse spark-submit --deploy-mode cluster --supervise --class com.datastax.HttpSparkStream target/HttpSparkStream.jar -d $NUM_SPARK_NODES
To submit an application using cluster mode when TLS is enabled
Pass the SSL configuration with standard Spark commands to use secure HTTPS on port 4440.
dse spark-submit \
  --conf spark.ssl.ui.enabled=true \
  --conf spark.ssl.ui.keyPassword=keystore_password \
  --conf spark.ssl.ui.keyStore=path_to_keystore \
  myApplication.jar
To set the driver host to a publicly accessible IP address
dse spark-submit --conf spark.driver.host=203.0.113.0 myApplication.jar
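Because `dse spark-submit` accepts the same options as Apache Spark `spark-submit`, repeated `--conf` settings can also be collected in a properties file and passed with the standard `--properties-file` flag. A minimal sketch, where the file path and the setting values are illustrative only:

```shell
# Write illustrative spark-defaults style settings to a temporary file.
# The keys are standard Spark configuration names; the values are examples.
cat > /tmp/my-spark.conf <<'EOF'
spark.driver.host=203.0.113.0
spark.ssl.ui.enabled=true
EOF

# Equivalent to passing each setting individually with --conf:
# dse spark-submit --properties-file /tmp/my-spark.conf myApplication.jar
cat /tmp/my-spark.conf
```

This keeps long submissions readable and lets the same configuration be reused across applications.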
To get the status of a driver
Unlike the Apache Spark option, you do not have to specify the Spark Master IP address.
dse spark-submit --status driver-20180726160353-0019
Driver driver-20180726160353-0019 found: state=<state>, worker=<workerId> (<workerHostPort>)
To kill a driver
Unlike the Apache Spark option, you do not have to specify the Spark Master IP address.
dse spark-submit --kill driver-20180726160353-0019
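The `--status` and `--kill` examples above assume you have the driver ID from the original cluster-mode submission. One way to keep it is to capture it from the submission output with a pattern match; a minimal sketch, where the sample output line is a hypothetical stand-in for what your submission actually prints:

```shell
# Hypothetical captured submission output; the exact message wording is an
# assumption, only the driver-<timestamp>-<sequence> ID format comes from
# the examples above.
SUBMIT_OUTPUT='Driver successfully submitted as driver-20180726160353-0019'

# Extract the driver ID for later --status or --kill calls.
DRIVER_ID=$(printf '%s\n' "$SUBMIT_OUTPUT" | grep -oE 'driver-[0-9]{14}-[0-9]{4}')
echo "$DRIVER_ID"
```

The extracted value can then be passed directly, for example `dse spark-submit --kill "$DRIVER_ID"`.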