Installing DataStax Bulk Loader 1.4.1

Install DataStax Bulk Loader to efficiently load and unload CSV or JSON data to and from DSE, DDAC, DataStax Apollo, and Apache Cassandra databases.

DataStax Bulk Loader lets you efficiently and reliably load and unload CSV/JSON data in and out of:
  • DataStax Enterprise (DSE) 4.7 and later databases
  • Open source Apache Cassandra™ 2.1 and later databases
  • DataStax Distribution of Apache Cassandra (DDAC) databases
  • DataStax Apollo cloud databases
DataStax recommends using the latest dsbulk version, which is currently 1.4.1.

DataStax Bulk Loader is supported on Linux, macOS, and Windows platforms.

You can use DataStax Bulk Loader as a standalone tool that connects remotely to a cluster. The tool does not need to run on a cluster node, although it can.
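As an illustration of running the tool from a machine outside the cluster, the following sketch wraps a remote load in a small shell function. The function name, the DSBULK override variable, and the host, keyspace, table, and file values are assumptions made for this example; -h, -k, -t, and -url are the dsbulk shortcuts for contact points, keyspace, table, and connector URL.

```shell
# Hypothetical helper: load a CSV file into a table on a remote cluster.
# Arguments: contact point, keyspace, table, CSV file or directory.
# DSBULK may be set to another executable path, or to "echo" for a dry run
# that only prints the arguments.
dsbulk_remote_load() {
  "${DSBULK:-dsbulk}" load -h "$1" -k "$2" -t "$3" -url "$4"
}

# Example call (all values are placeholders):
# dsbulk_remote_load 10.200.1.3 ks1 table1 export.csv
```

Setting DSBULK=echo before calling the function prints the arguments instead of contacting a cluster, which is a cheap way to check them first.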
Attention: Before upgrading to DataStax Bulk Loader 1.4.0 and later releases, note that 1.4.0 added support for the latest 2.x version of the DataStax Java driver. Many new driver options are available directly with dsbulk commands via the datastax-java-driver prefix. If you are using a pre-4.7 DSE release, the new driver options are not supported and you must use or remain on DataStax Bulk Loader 1.3.4.

For open source Apache Cassandra 2.1 and later databases, DataStax Bulk Loader 1.4.1 added support for load and count operations; previous DataStax Bulk Loader releases supported unload operations only.

Procedure

Important: End User License Agreement (EULA). By downloading DataStax products, you confirm that you agree to the processing of information as described in the DataStax website privacy policy and agree to the website terms of use.
  1. Download the tarball or zip file from the DataStax Bulk Loader download page. Select the package for your OS: a tar file is provided for Linux and macOS, and a zip file for Windows.
  2. If you agree, enable the Terms checkbox and click the Download button.
  3. Unpack the distribution. Linux example:
    tar -xzvf dsbulk-1.4.1.tar.gz

    The files are extracted into a dsbulk-1.4.1 directory under the current directory.
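If you prefer to unpack somewhere other than the current directory, a minimal sketch along these lines may help. The function name unpack_dsbulk and the destination path are hypothetical and not part of dsbulk; the sketch only assumes a tar that supports -C.

```shell
# unpack_dsbulk ARCHIVE DEST: extract a dsbulk tarball into DEST.
# Illustrative helper; neither the name nor the arguments come from dsbulk.
unpack_dsbulk() {
  archive="$1"
  dest="$2"
  # Fail early with a clear message if the download is missing.
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi
  # Create the destination if needed, then extract into it.
  mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Example call (destination is a placeholder):
# unpack_dsbulk dsbulk-1.4.1.tar.gz "$HOME/tools"
```

After a call like the example, the executable would be at $HOME/tools/dsbulk-1.4.1/bin/dsbulk.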

What's next

A package install of DSE or DDAC includes a prior version of dsbulk, such as 1.2.0 or 1.3.0. If the node where you just installed dsbulk has such a package install, update your PATH after unpacking the standalone tarball so that the new version is found first.
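To see whether an older copy would shadow the new one, a sketch like the following lists every dsbulk executable on PATH in lookup order. The function name list_dsbulk_on_path is an assumption for illustration.

```shell
# List every executable file named dsbulk found on PATH, in lookup order.
# The first line printed is the copy the shell would run.
list_dsbulk_on_path() {
  old_ifs=$IFS
  IFS=:
  for dir in $PATH; do
    # An empty PATH entry means the current directory.
    candidate="${dir:-.}/dsbulk"
    if [ -f "$candidate" ] && [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
    fi
  done
  IFS=$old_ifs
}

# Example call:
# list_dsbulk_on_path
```

If the first path printed is not under the newly unpacked dsbulk-1.4.1 directory, the older copy still takes precedence.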

To update PATH on a macOS node, for example, edit your $HOME/.bashrc file and add a command such as:
export PATH=path-to-unpacked-location/dsbulk-1.4.1/bin:$PATH
From the command line, source the updated .bashrc and verify the dsbulk version. In this example, the last line is the command output:
source ~/.bashrc
dsbulk --version
DataStax Bulk Loader v1.4.1
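For scripted installs, the same check can be automated. The sketch below assumes the version line has the form shown above ("DataStax Bulk Loader v" followed by the version number); the function name dsbulk_version_from is hypothetical.

```shell
# Extract the version number from a line such as "DataStax Bulk Loader v1.4.1".
# Prints nothing if the line does not match that form.
dsbulk_version_from() {
  printf '%s\n' "$1" | sed -n 's/^DataStax Bulk Loader v\([0-9][0-9.]*\).*$/\1/p'
}

# Example use once dsbulk is on PATH:
# [ "$(dsbulk_version_from "$(dsbulk --version)")" = "1.4.1" ] \
#   || echo "unexpected dsbulk version" >&2
```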
Next, learn how to get started with dsbulk.