dse client-tool spark

Connect an external client to a DataStax Enterprise node and perform operations related to integrated Spark.

Synopsis

dse client-tool connection_options spark 
(master-address | leader-address | version | 
sql-schema (--exclude | --keyspace | --table | --decimal | --all)
metastore-migrate --from from_version --to to_version)

Table 1. Legend

Syntax conventions              Description
UPPERCASE                       Literal keyword.
Lowercase                       Not literal.
Italics                         Variable value. Replace with a valid option or user-defined value.
[ ]                             Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets.
( )                             Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.
|                               Or. A vertical bar ( | ) separates alternative elements. Type any one of the elements. Do not type the vertical bar.
...                             Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required.
'Literal string'                Single quotation ( ' ) marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case.
{ key:value }                   Map collection. Braces ( { } ) enclose map collections or key value pairs. A colon separates the key and the value.
<datatype1,datatype2>           Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma.
cql_statement;                  End CQL statement. A semicolon ( ; ) terminates all CQL statements.
[ -- ]                          Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.
' <schema> ... </schema> '      Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration.
@xml_entity='xml_entity_type'   Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files.

leader-address
Returns the IP address of the currently selected Spark Master for the datacenter.
master-address
Returns the localhost IP address used to configure Spark applications. The address is returned as a URI:
dse://ip:port?connection.local_dc=dc_name;connection.host=cs_list_contactpoints;

The connection.host=cs_list_contactpoints option is a comma-separated list of IP addresses for additional contact points. These contact points are up to five randomly selected nodes from the datacenter.

Note: DSE automatically connects Spark applications to the Spark Master. You do not need to use the IP address of the current Spark Master in the connection URI.
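The URI layout described above can be taken apart with plain shell parameter expansion. The following is a minimal sketch, not part of the dse tooling: the function names master_endpoint and master_option are hypothetical, and the sample URI is the one shown in the Examples section. It assumes options are separated by semicolons, as in that sample.

```shell
# Sketch: split a dse:// master URI into its parts.
# master_endpoint and master_option are hypothetical helper names.

# Print the "ip:port" part of the URI.
master_endpoint() {
  local uri="$1"
  local rest="${uri#dse://}"      # strip the scheme
  printf '%s\n' "${rest%%\?*}"    # keep everything before the first '?'
}

# Print the value of one ';'-separated option, for example connection.local_dc.
# Returns 1 if the option is not present.
master_option() {
  local uri="$1" key="$2"
  local query=";${uri#*\?}"       # prefix ';' so every option starts with ';'
  local value
  case "$query" in
    *";$key="*) ;;
    *) return 1 ;;
  esac
  value=${query#*";$key="}        # drop everything up to and including ";key="
  printf '%s\n' "${value%%;*}"    # keep the text up to the next ';'
}

uri='dse://10.200.181.62:9042?connection.local_dc=Analytics;connection.host=10.200.181.63;'
master_endpoint "$uri"
master_option "$uri" connection.local_dc
master_option "$uri" connection.host
```

In practice the uri variable would hold the output of dse client-tool spark master-address rather than a hard-coded string.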
metastore-migrate --from from_version --to to_version
Migrates the Spark SQL metastore from one DSE version to another DSE version.
  • --from from_version - the version to migrate the metastore from
  • --to to_version - the version to migrate the metastore to
version
Returns the version of Spark that is bundled with DataStax Enterprise.
sql-schema (--exclude | --keyspace | --table | --decimal | --all)
Exports the SQL table creation query with these options:
  • --table tablename - comma-separated list of tables to include
  • --exclude csvlist - comma-separated list of tables to exclude
  • --all - includes all keyspaces
  • --keyspace csvlist - comma-separated list of keyspaces to include

Examples

View the Spark connection URL for this datacenter:

dse client-tool spark master-address
dse://10.200.181.62:9042?connection.local_dc=Analytics;connection.host=10.200.181.63;

View the IP address of the current Spark Master in this datacenter:

dse client-tool spark leader-address
10.200.181.62

Generate Spark SQL schema files

You can use the generated schema files with Spark SQL on external Spark clusters.

dse client-tool --use-server-config spark sql-schema --all > output.sql
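When one file per keyspace is preferred over a single dump, the --keyspace option listed above could be called in a loop. This is a rough sketch under stated assumptions: export_schemas and DSE_CMD are hypothetical names, the keyspace names in the usage comment are placeholders, and it assumes --keyspace can be combined with --use-server-config in the same way as --all.

```shell
# Sketch: write one schema file per keyspace using the --keyspace option.
# DSE_CMD and export_schemas are hypothetical names introduced for this example.
DSE_CMD="${DSE_CMD:-dse}"

# export_schemas OUTDIR KEYSPACE...
# Runs sql-schema once per keyspace and writes OUTDIR/KEYSPACE.sql.
export_schemas() {
  local outdir="$1"
  shift
  local ks
  for ks in "$@"; do
    "$DSE_CMD" client-tool --use-server-config spark sql-schema --keyspace "$ks" \
      > "$outdir/$ks.sql" || return 1   # stop at the first failed export
  done
}

# Usage (needs a reachable DSE node; keyspace names are placeholders):
#   mkdir -p schemas && export_schemas schemas ks_one ks_two
```

Keeping the command name in a variable makes it possible to substitute echo for a dry run that only records the command lines.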

Migrate Spark metastore

After upgrading, map custom external tables from the DSE 5.0.11 format to the DSE 6.7.0 format of the Hive metastore used by Spark SQL:

dse client-tool spark metastore-migrate --from 5.0.11 --to 6.7.0