dse client-tool spark
Connect an external client to a DataStax Enterprise node and perform operations related to integrated Spark.
Synopsis
dse client-tool connection_options spark
(master-address | leader-address | version |
sql-schema (--exclude | --keyspace | --table | --decimal | --all) |
metastore-migrate --from from_version --to to_version)
Syntax conventions | Description
---|---
UPPERCASE | Literal keyword.
Lowercase | Not literal.
*Italics* | Variable value. Replace with a valid option or user-defined value.
[ ] | Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets.
( ) | Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.
\| | Or. A vertical bar ( \| ) separates alternative elements. Type any one of the elements. Do not type the vertical bar.
... | Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required.
'Literal string' | Single quotation marks ( ' ) must surround literal strings in CQL statements. Use single quotation marks to preserve uppercase.
{ key:value } | Map collection. Braces ( { } ) enclose map collections or key-value pairs. A colon separates the key and the value.
<datatype1,datatype2> | Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma.
cql_statement; | End CQL statement. A semicolon ( ; ) terminates all CQL statements.
[ -- ] | Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.
' <schema> ... </schema> ' | Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration.
@xml_entity='xml_entity_type' | Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files.
- leader-address
- Returns the IP address of the currently selected Spark Master for the datacenter.
- master-address
- Returns the localhost IP address used to configure Spark applications. The address is returned as a URI:
dse://ip:port?connection.local_dc=dc_name;connection.host=cs_list_contactpoints;
The connection.host=cs_list_contactpoints option is a comma-separated list of IP addresses of additional contact points. The additional contact points are up to five randomly selected nodes from the datacenter.
Note: DSE automatically connects Spark applications to the Spark Master. You do not need to use the IP address of the current Spark Master in the connection URI.
- metastore-migrate --from from_version --to to_version
- Migrates the Spark SQL metastore from one DSE version to another DSE version.
- --from from_version - the DSE version to migrate the metastore from
- --to to_version - the DSE version to migrate the metastore to
- version
- Returns the version of Spark that is bundled with DataStax Enterprise.
- sql-schema (--exclude | --keyspace | --table | --decimal | --all)
- Exports the SQL table creation query with these options:
- --table csvlist - comma-separated list of tables to include
- --exclude csvlist - comma-separated list of tables to exclude
- --all - includes all keyspaces
- --keyspace csvlist - comma-separated list of keyspaces to include
Examples
View the Spark connection URL for this datacenter:
dse client-tool spark master-address
dse://10.200.181.62:9042?connection.local_dc=Analytics;connection.host=10.200.181.63;
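The returned URI can be split into its components with standard shell tools, for example to feed the datacenter name or contact points to another script. A minimal sketch using the sample address above (the IP addresses, port, and datacenter name are illustrative, not real output):

```shell
# Sample URI in the format returned by `dse client-tool spark master-address`.
uri='dse://10.200.181.62:9042?connection.local_dc=Analytics;connection.host=10.200.181.63;'

# Strip the dse:// scheme, then separate host:port from the query string.
hostport=${uri#dse://}; hostport=${hostport%%\?*}
query=${uri#*\?}

master_ip=${hostport%%:*}
master_port=${hostport##*:}

# Pull out the datacenter name and the additional contact points.
dc_name=$(printf '%s' "$query" | sed -n 's/.*connection\.local_dc=\([^;]*\).*/\1/p')
contact_points=$(printf '%s' "$query" | sed -n 's/.*connection\.host=\([^;]*\).*/\1/p')

echo "master=$master_ip:$master_port dc=$dc_name extra=$contact_points"
```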
View the IP address of the current Spark Master in this datacenter:
dse client-tool spark leader-address
10.200.181.62
Generate Spark SQL schema files
You can use the generated schema files with Spark SQL on external Spark clusters.
dse client-tool --use-server-config spark sql-schema --all > output.sql
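The sql-schema options can be combined to export only part of the schema. A minimal sketch that composes such an invocation (the keyspace and table names are made-up placeholders, not values from a real cluster):

```shell
# Hypothetical keyspaces and excluded table; substitute your own names.
keyspaces="ks1,ks2"
excluded="ks1.audit_log"

# Build the command line; on a DSE node, run it as `$cmd > schema.sql`
# to capture the generated schema file.
cmd="dse client-tool spark sql-schema --keyspace $keyspaces --exclude $excluded"
echo "$cmd"
```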
Migrate Spark metastore
To map custom external tables from DSE 5.0.11 to the DSE 6.0.0 release format of the Hive metastore used by Spark SQL after upgrading:
dse client-tool spark metastore-migrate --from 5.0.11 --to 6.0.0
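Before invoking metastore-migrate, a wrapper script can sanity-check that the target version is newer than the source version. A minimal sketch using sort -V (the version values mirror the example above and are purely illustrative):

```shell
# Versions to migrate between; replace with your actual DSE versions.
from_version="5.0.11"
to_version="6.0.0"

# `sort -V` orders version strings numerically; the last line is the newer one.
newer=$(printf '%s\n%s\n' "$from_version" "$to_version" | sort -V | tail -n 1)

if [ "$newer" = "$to_version" ] && [ "$from_version" != "$to_version" ]; then
  echo "ok: dse client-tool spark metastore-migrate --from $from_version --to $to_version"
else
  echo "refusing: $to_version is not newer than $from_version"
fi
```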