dse client-tool spark
Perform operations related to integrated Apache Spark™.
Synopsis
dse client-tool <connection_options> spark
(master-address | leader-address | version |
sql-schema (--exclude | --keyspace | --table | --decimal | --all) |
metastore-migrate --from <from_version> --to <to_version>)
Syntax legend
| Syntax conventions | Description |
|---|---|
| Italic, bold, or < > | Syntax diagrams and code samples use one or more of these styles to mark placeholders for variable values. Replace placeholders with a valid option or your own user-defined value. In CQL statements, angle brackets are required to enclose data types in a set, list, map, or tuple. Separate the data types with a comma. In Search CQL statements, angle brackets are used to identify the entity and literal value to overwrite the XML element in the schema. |
| [ ] | Square brackets surround optional command arguments. Do not type the square brackets. |
| ( ) | Parentheses identify a group to choose from. Do not type the parentheses. |
| \| | A pipe separates alternative elements. Type any one of the elements. Do not type the pipe. |
| ... | Indicates that you can repeat the syntax element as often as required. |
| ' ' | Single quotation marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case. For Search CQL only: single quotation marks surround an entire XML schema declaration. |
| { } | Map collection. Curly braces enclose maps. |
| ; | Ends a CQL statement. |
| -- | Separate command line options from command arguments with two hyphens. This syntax is useful when arguments might be mistaken for command line options. |
- leader-address
  Returns the IP address of the currently selected Spark Master for the datacenter.
- master-address
  Returns the localhost IP address used to configure Spark applications. The address is returned as a URI:
  dse://<ip>:<port>?connection.local_dc=<dc_name>;connection.host=<cs_list_contactpoints>;
  The connection.host=<cs_list_contactpoints> option is a comma-separated list of IP addresses of additional contact points. The additional contact points are up to five randomly selected nodes from the datacenter. DSE automatically connects Spark applications to the Spark Master. You do not need to use the IP address of the current Spark Master in the connection URI.
- metastore-migrate --from <from_version> --to <to_version>
  Migrates the Spark SQL metastore from one DSE version to another DSE version.
  - --from <from_version>: the DSE version to migrate the metastore from
  - --to <to_version>: the DSE version to migrate the metastore to
- version
  Returns the version of Apache Spark that is bundled with DSE.
- sql-schema (--exclude | --keyspace | --table | --decimal | --all)
  Exports the SQL table creation query with these options:
  - --table tablename: comma-separated list of tables to include
  - --exclude csvlist: comma-separated list of tables to exclude
  - --all: includes all keyspaces
  - --keyspace csvlist: comma-separated list of keyspaces to include
Examples
View the Apache Spark connection URL for this datacenter:
dse client-tool spark master-address
dse://10.200.181.62:9042?connection.local_dc=Analytics;connection.host=10.200.181.63;
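If a script needs the individual parts of this URI, it can be split with plain shell parameter expansion. The following is a minimal sketch, not a DSE feature: `parse_master_uri` is a hypothetical helper name, and the sketch assumes the URI always follows the `dse://<ip>:<port>?key=value;key=value;` layout shown above.

```shell
#!/usr/bin/env bash
# parse_master_uri is a hypothetical helper, not part of DSE.
# It assumes the layout dse://<ip>:<port>?<param>;<param>;
parse_master_uri() {
  local rest="${1#dse://}"        # drop the scheme
  local hostport="${rest%%\?*}"   # everything before the first '?'
  local query=""
  case "$rest" in
    *\?*) query="${rest#*\?}" ;;  # everything after the first '?'
  esac
  echo "host=${hostport%%:*}"
  echo "port=${hostport##*:}"
  # Connection parameters are separated by ';'
  local IFS=';'
  local param
  for param in $query; do
    if [ -n "$param" ]; then echo "$param"; fi
  done
}

parse_master_uri 'dse://10.200.181.62:9042?connection.local_dc=Analytics;connection.host=10.200.181.63;'
```

In practice, the literal URI would likely be replaced by the captured output of `dse client-tool spark master-address`.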
View the IP address of the current Apache Spark Master in this datacenter:
dse client-tool spark leader-address
10.200.181.62
Generate Apache Spark SQL schema files
You can use the generated schema files with Spark SQL on external Spark clusters.
dse client-tool --use-server-config spark sql-schema --all > output.sql
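To export only selected keyspaces instead of all of them, the `--keyspace` option takes a comma-separated list. The sketch below shows one way to build that list in a script. It rests on assumptions: `join_csv` is a hypothetical helper, the keyspace names `sales` and `inventory` are placeholders, and the list is assumed to follow the flag after a space. The command is printed rather than run, so the sketch works on a machine without DSE installed.

```shell
#!/usr/bin/env bash
# join_csv is a hypothetical helper: joins its arguments with commas.
join_csv() {
  local IFS=','
  echo "$*"
}

# Placeholder keyspace names; replace with your own.
keyspaces="$(join_csv sales inventory)"

# Print the command for review instead of executing it.
echo "dse client-tool --use-server-config spark sql-schema --keyspace ${keyspaces} > output.sql"
```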
Migrate Apache Spark metastore
To map custom external tables from DSE 5.0.11 to the DSE 6.7.0 release format of the Hive metastore used by Spark SQL after upgrading:
dse client-tool spark metastore-migrate --from 5.0.11 --to 6.7.0
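Because a mistyped version number is easy to miss, a wrapper script might check both values before calling the tool. This is a minimal sketch under stated assumptions: `is_dse_version` and `migrate_cmd` are hypothetical helpers, and versions are assumed to have exactly three numeric parts, as in the example above. The wrapper prints the command rather than running it.

```shell
#!/usr/bin/env bash
# is_dse_version is a hypothetical helper: accepts only x.y.z with numeric parts.
is_dse_version() {
  case "$1" in
    ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;  # empty, non-digits, or stray dots
  esac
  local IFS='.'
  set -- $1            # split on dots into positional parameters
  [ "$#" -eq 3 ]       # require exactly three fields
}

# migrate_cmd is a hypothetical wrapper: validates, then prints the command.
migrate_cmd() {
  if ! is_dse_version "$1" || ! is_dse_version "$2"; then
    echo "invalid version: expected x.y.z" >&2
    return 1
  fi
  echo "dse client-tool spark metastore-migrate --from $1 --to $2"
}

migrate_cmd 5.0.11 6.7.0
```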