dse spark-submit

Launches applications on a cluster to enable use of Spark cluster managers through a uniform interface. This command supports the same options as Apache Spark™ spark-submit.

Restriction: Command is supported only on nodes with analytics workloads.

Synopsis

$ dse spark-submit
--class class_name
jar_file other_options
[--master master_ip_address]
Syntax conventions

UPPERCASE
    Literal keyword.

Lowercase
    Not literal.

Italics
    Variable value. Replace with a valid option or user-defined value.

[ ]
    Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets.

( )
    Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.

|
    Or. A vertical bar ( | ) separates alternative elements. Type any one of the elements. Do not type the vertical bar.

...
    Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required.

'Literal string'
    Single quotation marks ( ' ) must surround literal strings in CQL statements. Use single quotation marks to preserve uppercase.

{ key:value }
    Map collection. Braces ( { } ) enclose map collections or key-value pairs. A colon separates the key and the value.

<datatype1,datatype2>
    Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma.

cql_statement;
    End CQL statement. A semicolon ( ; ) terminates all CQL statements.

[ -- ]
    Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.

' <schema> ... </schema> '
    Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration.

@xml_entity='xml_entity_type'
    Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files.

This command supports the same options as Apache Spark spark-submit, with one difference: in DSE deployments, the --status and --kill options do not require the Spark Master IP address.

--master master_ip_address

The IP address of the Spark Master running in the DSE cluster.

Examples

To submit an application whose main class defines an option named d

$ dse spark-submit --class com.datastax.HttpSparkStream target/HttpSparkStream.jar -d $NUM_SPARK_NODES

To submit an application in cluster mode with the supervise option, so that it restarts in case of failure

$ dse spark-submit --deploy-mode cluster --supervise --class com.datastax.HttpSparkStream target/HttpSparkStream.jar -d $NUM_SPARK_NODES

To submit an application using cluster mode when TLS is enabled

Pass the SSL configuration with standard Spark --conf options to use secure HTTPS on port 4440.

$ dse spark-submit \
  --conf spark.ssl.ui.enabled=true \
  --conf spark.ssl.ui.keyPassword=keystore_password \
  --conf spark.ssl.ui.keyStore=path_to_keystore \
  myApplication.jar

To set the driver host to a publicly accessible IP address

$ dse spark-submit --conf spark.driver.host=203.0.113.0 myApplication.jar

To get the status of a driver

Unlike the Apache Spark option, you do not have to specify the Spark Master IP address.

$ dse spark-submit --status driver-20180726160353-0019

Result when the driver exists:

Driver driver-20180726160353-0019 found: state=<state>, worker=<workerId> (<workerHostPort>)
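
When the status is consumed by a script, the state can be pulled out of the result line. The following is a minimal sketch that assumes the output contains a line in exactly the format shown above; the function name driver_state, the worker ID, and the worker address are hypothetical placeholders, not part of DSE.

```shell
# Print the value of state= from the first line that matches the
# "Driver <id> found: state=<state>, ..." format shown above.
# Prints nothing and returns 1 if no line matches.
driver_state() {
    state=$(printf '%s\n' "$1" \
        | sed -n 's/^Driver [^ ]* found: state=\([A-Z_]*\),.*$/\1/p' \
        | head -n 1)
    [ -n "$state" ] && printf '%s\n' "$state"
}

# Sample line: the worker ID and host:port are made-up placeholders.
sample='Driver driver-20180726160353-0019 found: state=RUNNING, worker=worker-example (10.0.0.1:7078)'
driver_state "$sample"
```

Against a live cluster, the same function could be given the captured output of dse spark-submit --status, provided that output includes the result line unchanged.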

To kill a driver

Unlike the Apache Spark option, you do not have to specify the Spark Master IP address.

$ dse spark-submit --kill driver-20180726160353-0019
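
Because the kill option acts on whatever ID it is given, a script may want to check the ID before using it. The sketch below is one possible guard, not a DSE feature: the pattern is inferred only from the driver IDs in the examples on this page (driver-, a 14-digit timestamp, and a sequence number), and the function names is_driver_id and kill_command are hypothetical. It prints the command instead of running it.

```shell
# Return 0 if the argument looks like the driver IDs used in the examples:
# driver-<14-digit timestamp>-<sequence number>. The pattern is an assumption.
is_driver_id() {
    printf '%s\n' "$1" | grep -Eq '^driver-[0-9]{14}-[0-9]{4,}$'
}

# Dry run: print the kill command for a plausible ID, or refuse with an error.
kill_command() {
    if is_driver_id "$1"; then
        printf 'dse spark-submit --kill %s\n' "$1"
    else
        printf 'refusing to kill: "%s" does not look like a driver ID\n' "$1" >&2
        return 1
    fi
}

kill_command driver-20180726160353-0019
```

Printing the command first makes it easy to review before running it on a node with analytics workloads.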

