Importing SQL to a CQL table or CFS (deprecated)

Steps for importing SQL data into a CQL table or the Cassandra File System.

Note: Hadoop is deprecated for use with DataStax Enterprise. DSE Hadoop and BYOH (Bring Your Own Hadoop) are deprecated. Sqoop is also deprecated and will be removed when Hadoop is removed.

To import data to CQL, the keyspace and CQL table must exist prior to the importation. If the CQL table contains data prior to the importation, cql-import updates the data. The Sqoop demo shows how to import SQL data into a CQL table.

In addition to importing data to a CQL table, you can also import data to the Cassandra File System (CFS). The CFS is the Cassandra counterpart to the Hadoop Distributed File System (HDFS). The example in this section shows how to import SQL data to CFS. Using Hive and other utilities, you can access the CFS data.

Procedure

Follow the steps in the Sqoop demo to create the SQL database and table and the CQL keyspace and table.
Use the dse sqoop import command to migrate the data from the MySQL table to text files in the directory npa_nxx in the CFS. Use the database username and password. If the database account is not password-protected, just omit the password option. On Linux, for example:
```
$ bin/dse sqoop import --connect
    jdbc:mysql://127.0.0.1/npa_nxx_demo
    --username mysql
    --password password
    --table npa_nxx
    --target-dir /npa_nxx
```
DataStax Enterprise returns this message:
```
INFO mapreduce.ImportJobBase: Retrieved 105291 records.
```
Use the command to view the results in the CFS. On Linux, for example:
```
$ bin/dse hadoop fs -ls /npa_nxx
```
Depending on the number of DataStax Enterprise analytics nodes and task tracker configuration, the output shows a number of files in the directory, part-m-0000n, where n ranges from 0 to the number of tasks that were executed as part of the Hadoop job.
To view the contents of these files, use this hadoop fs command:
```
$ bin/dse hadoop fs -cat /npa_nxx/part-m-00000
```
By varying the number of tasks (the 00000), the output looks something like this:
```
361991,361,991,27.73,097.40,L,TX,Corpus Christi
361992,361,992,27.73,097.40,L,TX,Corpus Christi
361993,361,993,27.73,097.40,L,TX,Corpus Christi
361994,361,994,27.73,097.40,L,TX,Corpus Christi
361998,361,998,27.79,097.90,L,TX,Agua Dulce
361999,361,999,27.80,097.40,W,TX,Padre Island National Seashore
```
As shown in the output, the CSV file format that Sqoop requires does not include optional spaces in the delimiter.