About Sqoop (deprecated)

(Deprecated) DSE Hadoop supports Sqoop for migrating data and supports password authentication for Sqoop operations.

Note: Hadoop is deprecated for use with DataStax Enterprise. DSE Hadoop and BYOH (Bring Your Own Hadoop) are deprecated. Sqoop is also deprecated and will be removed when Hadoop is removed.

DSE Hadoop supports Sqoop, an Apache Software Foundation tool for transferring data between Hadoop and an RDBMS or other data source, such as a NoSQL database. DataStax Enterprise supports the following operations:
  • Import and export data to and from CQL tables and any JDBC-compliant data source.
  • Import SQL files into the CQL collection types: set, list, and map.
  • Import data into CQL using a reusable, file-based import command.
  • Import legacy data using the thrift-import tool that supports backward compatibility with earlier DataStax Enterprise versions.
  • Use conventional Sqoop commands to import data into the Cassandra File System (CFS), the counterpart to HDFS, instead of a CQL table.

You can import and export the MySQL, PostgreSQL, and Oracle data types that are listed in the Sqoop reference. An analytics node runs the MapReduce job that Sqoop uses to import data from, or export data to, a data source. You need a JDBC driver for the RDBMS or other type of data source.

Importing data

You can import data from any JDBC-compliant data source. For example:

  • DB2
  • MySQL

    If you encounter SQOOP-1400, use the latest version of the MySQL Connector/J (mysql-connector-java).

  • Oracle
  • SQL Server
  • Sybase
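
As a rough illustration of what a conventional Sqoop import from a JDBC source involves, the following Python sketch assembles the argument list for such a command. It is a minimal sketch under stated assumptions, not the documented DataStax procedure: the `dse sqoop` launcher prefix, the host, database, user, table, and target directory are all hypothetical placeholders, and the helper names `jdbc_url` and `sqoop_import_args` are invented for this example. The options shown (`--connect`, `--username`, `-P`, `--table`, `--target-dir`) are standard Apache Sqoop import arguments; the CQL-specific import tools take additional options that are not shown here.

```python
import shlex

# Commonly used JDBC URL shapes. Check your driver's documentation for the
# exact form it expects; host, port, and database are placeholders.
JDBC_URL_TEMPLATES = {
    "mysql": "jdbc:mysql://{host}:{port}/{database}",
    "postgresql": "jdbc:postgresql://{host}:{port}/{database}",
    "db2": "jdbc:db2://{host}:{port}/{database}",
    "sqlserver": "jdbc:sqlserver://{host}:{port};databaseName={database}",
}


def jdbc_url(kind, host, port, database):
    """Build a JDBC connection URL for one of the source types above."""
    return JDBC_URL_TEMPLATES[kind].format(host=host, port=port, database=database)


def sqoop_import_args(connect_url, username, table, target_dir,
                      launcher=("dse", "sqoop")):
    """Return the argv list for a conventional Sqoop import.

    The launcher prefix is an assumption; use ("sqoop",) for a plain
    Apache Sqoop installation. -P makes Sqoop prompt for the password
    instead of taking it on the command line.
    """
    return [
        *launcher, "import",
        "--connect", connect_url,
        "--username", username,
        "-P",
        "--table", table,
        "--target-dir", target_dir,
    ]


if __name__ == "__main__":
    # Hypothetical source database and target directory.
    url = jdbc_url("mysql", "db.example.com", 3306, "sales")
    args = sqoop_import_args(url, "etl_user", "orders", "/sqoop/orders")
    # shlex.join quotes any argument that needs it (for example, the ';'
    # in a SQL Server URL) so the line can be pasted into a shell.
    print(shlex.join(args))
```

Building the command as a list rather than a single string keeps each argument intact if it is later passed to `subprocess.run`, and avoids shell-quoting mistakes with URLs that contain special characters.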

Securing Sqoop

DataStax Enterprise supports password authentication for Sqoop operations. Configure password authentication using Cassandra-specific properties. Kerberos is also supported, as is client-to-node encryption (SSL) for data that Sqoop imports and exports.