
Creating configuration files for DataStax Bulk Loader

You can pass all DataStax Bulk Loader options as command line arguments or in configuration files. Using one or more configuration files is often easier than passing all configuration options via the command line.

By default, the configuration files are located under the DataStax Bulk Loader conf directory. The main configuration file is named application.conf. The default location can be changed via the -f switch on the dsbulk command line.

DataStax Bulk Loader ships with a default, initially empty application.conf file that you can customize for your environment. Template configuration files for DataStax Bulk Loader and for the Java driver are included in the tarball, under the ./manual folder, and can serve as a starting point for further customization.

Configuration files must comply with HOCON syntax. This syntax is flexible and allows sections to be grouped together in blocks. For example:

dsbulk {
  connector {
    name = "csv"
    csv {
      url = "C:\\Users\\My Folder"
      delimiter = "\t"
    }
  }
}

The example above is equivalent to the following snippet using dotted notation instead of blocks:

dsbulk.connector.name = "csv"
dsbulk.connector.csv.url = "C:\\Users\\My Folder"
dsbulk.connector.csv.delimiter = "\t"

You can split your configuration across more than one file by using file inclusions. For details, see the HOCON documentation.
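As a rough illustration of file inclusion, the sketch below assumes a hypothetical file named connector.conf that sits next to the main configuration file; the file name and the keyspace and table values are placeholders, not names shipped with DataStax Bulk Loader.

```hocon
# Main configuration file (sketch).
# Pull in settings from a second file; "connector.conf" is a hypothetical name.
include "connector.conf"

# Settings defined after the include statement override
# any same-named settings that came from the included file.
dsbulk.schema.keyspace = "ks1"   # placeholder keyspace
dsbulk.schema.table = "table1"   # placeholder table
```

HOCON also accepts the explicit forms include file("..."), include classpath("...") and include url("...") when the location of the included resource needs to be unambiguous.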

The default configuration file includes another file called driver.conf, also located in the DataStax Bulk Loader conf directory. Use the driver.conf file to configure the DataStax Java Driver for DataStax Bulk Loader. This file is also initially empty, and you can customize it to your needs. A driver template configuration file, included in the DataStax Bulk Loader tarball under the ./manual folder, can serve as a starting point for further customization.
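A customized driver.conf might look something like the following minimal sketch. The option names are taken from the Java driver's basic configuration section. The address and datacenter name are placeholders chosen for illustration, so check them against the driver template file before relying on them.

```hocon
# driver.conf (sketch): driver settings use the full datastax-java-driver prefix.
datastax-java-driver {
  basic {
    # Placeholder node address; replace with a node in your own cluster.
    contact-points = ["127.0.0.1:9042"]

    # Placeholder datacenter name.
    load-balancing-policy.local-datacenter = "dc1"
  }
}
```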

Important caveats

In configuration files for DataStax Bulk Loader:

  • You cannot omit the prefix dsbulk. For example, to select the connector to use in a configuration file, use dsbulk.connector.name = csv, as in the example above; on the command line, however, you can use --dsbulk.connector.name csv or --connector.name csv to achieve the same effect.

  • You cannot abbreviate the prefix datastax-java-driver to driver. For example, to select the consistency level in a configuration file, use datastax-java-driver.basic.request.consistency = QUORUM; on the command line, however, you can use either --datastax-java-driver.basic.request.consistency QUORUM or --driver.basic.request.consistency QUORUM to achieve the same effect.

Options specified on the command line override the same options specified in configuration files.
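The two caveats and the override rule can be summarized in one short sketch. The settings are the ones used in the examples above; the LOCAL_QUORUM value in the comment is only an illustrative alternative.

```hocon
# Configuration file fragment (sketch): both prefixes are written out in full.
dsbulk.connector.name = csv
datastax-java-driver.basic.request.consistency = QUORUM

# On the command line, the shortened forms are accepted, for example:
#   --connector.name csv
#   --driver.basic.request.consistency LOCAL_QUORUM
# A command-line value such as LOCAL_QUORUM would take precedence
# over the QUORUM value set in this file.
```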

