Importing data into a CQL list or set (deprecated)

Use the cql-import tool to map SQL columns to items in a collection set, list, or map.

Note: Hadoop is deprecated for use with DataStax Enterprise. DSE Hadoop and BYOH (Bring Your Own Hadoop) are deprecated. Sqoop is also deprecated and will be removed when Hadoop is removed.

DataStax Enterprise supports importing data into a CQL collection using the cql-import tool. You can use the cql-import tool to map SQL columns to items in a collection set, list, or map.

The cql-import tool supports two distinct mechanisms for importing data into list and set data types. Both mechanisms use the --cassandra-column-mapping parameter.

Mapping multiple SQL columns in a single row to a CQL list or set 

The cql-import command supports the following cassandra-column-mapping parameter for mapping multiple SQL columns in a single row to a CQL list or set.
CQLCOL:[SQLCOL1,SQLCOL2,SQLCOL3]

This form of mapping adds the SQL columns SQLCOL1,SQLCOL2, and SQLCOL3 to the list or set CQLCOL.

The following example shows how to map a MySQL table having multiple SQL columns in a single row to a CQL list.

Suppose you have created and populated a MySQL table using the following commands:
mysql> CREATE TABLE sql_table (sqlid INTEGER PRIMARY KEY,  a VARCHAR(25), b VARCHAR(25), c VARCHAR(25)); 
mysql> INSERT INTO sql_table (sqlid, a, b, c) values (1, 'valuea', 'valueb', 'valuec');
mysql> INSERT INTO sql_table (sqlid, a, b, c) values (2, 'valued', 'valuee', 'valuef');
Using cqlsh, suppose you create the following table in Cassandra that corresponds to the MySQL table:
cqlsh> CREATE TABLE cql_table (cqlid int PRIMARY KEY, mylist listtext);
The following map along with other options imports the data into CQL:
--cassandra-column-mapping cqlid:sqlid,mylist:[a,b,c]
Note that there are no spaces after the comma (,) or the colon (:) in cqlid:sqlid,mylist:[a,b,c].
Querying Cassandra to select the table produces the following output:
 id  | mylist                       
-----+-----------------------------
 1   | {'valuea','valueb','valuec'}
 2   | {'valued','valuee','valuef'}

Mapping a single SQL column from multiple SQL rows to a CQL list or set 

The cql-import command also supports the following cassandra-column-mapping parameter to map a single SQL column from multiple SQL rows to a CQL list or set.
CQLCOL:SQLCOL

This form of mapping appends SQL column values from multiple SQL rows that share a common key to a CQL list or set.

The following example shows how to map a MySQL table having a single SQL column from multiple SQL rows to a CQL list.

Suppose you have created and populated a MySQL table using the following commands:
mysql> CREATE TABLE sql_table (sqlid INTEGER PRIMARY KEY,  id INTEGER, a VARCHAR(25)); 
mysql> INSERT INTO sql_table (sqlid, id, a) values (1, 1, 'valuea');
mysql> INSERT INTO sql_table (sqlid, id, a) values (2, 1, 'valueb');
mysql> INSERT INTO sql_table (sqlid, id, a) values (3, 1, 'valuec');
mysql> INSERT INTO sql_table (sqlid, id, a) values (4, 2, 'valued');
mysql> INSERT INTO sql_table (sqlid, id, a) values (5, 2, 'valuee');

Using cqlsh, suppose you create the following table in Cassandra that corresponds to the MySQL table:

cqlsh> CREATE TABLE cql_table (cqlid int PRIMARY KEY, mylist listtext);
The following map along with other options imports the data into CQL:
--cassandra-column-mapping cqlid:id,mylist:a
Note that there are no spaces after the comma (,) or the colon (:) in cqlid:id,mylist:a.
Querying Cassandra to select the table produces the following output:
 id  | mylist                       
-----+-----------------------------
 1   | {'valuea','valueb','valuec'}
 2   | {'valued','valuee'}