Sqoop command

Use the dse sqoop command to import SQL data to CQL, export SQL data to CQL, and import Thrift/CLI data to CQL.

For use with Sqoop, use the DataStax Enterprise dse sqoop command to import SQL data to CQL, export SQL data to CQL, and import Thrift/CLI data to CQL.

All of the command options can be used in the import.options file.

$ dse sqoop action --connect jdbc_url --cassandra-keyspace ks --cassandra-table cf --cassandra-host host --table sql_table

where action is one of the following command options:

  • Import SQL data into CQL:
    cql-import
  • Export SQL data to CQL.
    cql-export
  • Import Thrift/CLI data to CQL.
    thrift-import
The following cql-import and cql-export command options are supported.

Cluster options 

These options define Cassandra cluster settings.
cql-import and cql-export cluster options
Command Description
--cassandra-consistency-level consistencylevel The Cassandra consistency level, which is LOCAL_ONE by default.
--cassandra-host host A comma separated list of Cassandra hosts.
--cassandra-partitioner partitioner The Cassandra partitioner, which is Murmur3Partitioner by default.
--cassandra-port port The Cassandra port.

Data options 

These data options work with the cql-import and cql-export tools, except where noted.

cql-import and cql-export data options
Command Description
--query sql_query Supports importing SQL joins.
--cassandra-column-mapping map

... where map = cql1:sql1,cql2:sql2

... CQLLISTSET:[SQLCOL,SQLCOL,SQLCOL]

... CQLMAP:[SQLCOL:VALCOL,SQLCOL:VALCOL]

Supports mapping ambiguous columns for import/export.
Maps cql and sql columns (not collections) for import/export.
Maps a list or set type for import/export.
Handles importing/exporting of a map type.
--cassandra-page-size cql-export only. Limits the page size of columns selected for export.
--cassandra-select-columns cql-export only. Select the named columns to export.
--cassandra-where-clause cql-export only. Filter the data selected for export based on the where condition.

Connection options 

Sqoop, by default, identifies the JDBC driver to use from the JDBC URL. For drivers that are not supported directly by Sqoop, you must specify the driver class name using the --driver option. To use the DataStax Enterprise specific tools with a custom driver, you must tell Sqoop which connection manager to use.

The --connection-manager parameter is required when the --driver option is specified for cql-import and cql-export actions. For example:
dse sqoop cql-import --driver driver --connection-manager com.datastax.bdp.sqoop.DseConnectionManager
dse sqoop cql-export --driver driver --connection-manager com.datastax.bdp.sqoop.DseConnectionManager 
cql-import and cql-export connection options
Command Description
--driver For drivers that are not directly supported by Sqoop, specify the driver class name. For example:

com.informix.jdbc.IfxDriver or

com.teradata.jdbc.TeraDriver
--connection-manager Specifies the connection manager for import or export to DataStax Enterprise. Required when the --driver option is used. The required value is:

com.datastax.bdp.sqoop.DseConnectionManager

Security options 

These security options are supported for the cql-import and cql-export actions.

cql-import and cql-export security options
Command Description
--cassandra-enable-kerberos Enables kerberos authentication
--cassandra-kerberos-config-path jaas.config_path Path to the users jaas.config file
--cassandra-enable-ssl Enables SSL transport
--cassandra-ssl-protocol protocol Configures the SSL protocol
--cassandra-truststore-algo algo Configures the SSL trust store algorithm
--cassandra-truststore-ciphers ciphers Configures the SSL trust store ciphers
--cassandra-truststore-location location Path to the SSL trust store
--cassandra-truststore-password tspassword Configures the SSL trust store password
--cassandra-truststore-type type Configures the SSL trust store type
--cassandra-username username Authenticates password, works only with the local Job Tracker
--cassandra-password password Authenticates password
--cassandra-kerberos-service-principal service_principal The Kerberos principal for which you have created a ticket using kinit

Conversion data types 

These data types are supported for conversion from SQL to CQL.

Allowable data type conversions for importing SQL to CQL
SQL data type CQL data type
VARCHAR text, ascii, varchar
BIT boolean, text, ascii, varchar
BIT(1) boolean, text, ascii, varchar
BIT(>1) blob
TINYINT int, bigint, varint, float, double, decimal, text, ascii, varchar
SMALLINT int, bigint, varint, float, double, decimal, text, ascii, varchar
INTEGER int, bigint, varint, float, double, decimal, text, ascii, varchar
BIGINT bigint, varint, float, double, decimal, text, ascii, varchar
FLOAT float, double, decimal, text, ascii, varchar
DOUBLE double, decimal, text, ascii, varchar
DECIMAL decimal, text, ascii, varchar
NUMERIC decimal, text, ascii, varchar
BLOB blob
CLOB blob, text, ascii, varchar
BINARY(n) blob, text, ascii, varchar
VARBINARY(n) blob, text, ascii, varchar
DATE timestamp, text, ascii, varchar
TIME timestamp, text, ascii, varchar
TIMESTAMP timestamp, text, ascii, varchar

Export data types 

Use the following data type map for exporting from CQL to SQL.

Export data types
CQL Type SQL Type
int TINYINT, SMALLINT, INTEGER, BIGINT, FLOAT, DOUBLE, DECIMAL, NUMERIC, VARCHAR
bigint BIGINT, FLOAT, DOUBLE, DECIMAL, NUMERIC, VARCHAR
varint DECIMAL, NUMERIC, VARCHAR
float FLOAT, DOUBLE, DECIMAL, NUMERIC, VARCHAR
double DOUBLE, DECIMAL, NUMERIC, VARCHAR
decimal DECIMAL, NUMERIC, VARCHAR
ascii VARCHAR, CLOB, BLOB, VARBINARY
text VARCHAR, CLOB, BLOB, VARBINARY
varchar VARCHAR, CLOB, BLOB, VARBINARY
timestamp DATE, TIME, TIMESTAMP, VARCHAR
boolean BOOLEAN, BIT, BIT(1), VARCHAR
blob BLOB, VARBINARY
inet VARCHAR
uuid VARCHAR
timeuuid VARCHAR