Creating configuration files for DataStax Bulk Loader

You can pass all DataStax Bulk Loader options as command line arguments or in configuration files. Using one or more configuration files is often easier than passing all configuration options via the command line.

By default, the configuration files are located under the DataStax Bulk Loader conf directory. The main configuration file is named application.conf. The default location can be changed via the -f switch on the dsbulk command line.

DataStax Bulk Loader provides a default, initially empty application.conf file that you can customize for your environment. Template configuration files for DataStax Bulk Loader and the Java driver are included in the tarball, under the ./manual folder, and can serve as a starting point for further customization.

Configuration files must comply with HOCON syntax. This syntax is flexible and allows sections to be grouped together in blocks. For example:

dsbulk {
  connector {
    name = "csv"
    csv {
      url = "C:\\Users\\My Folder"
      delimiter = "\t"
    }
  }
}

The example above is equivalent to the following snippet using dotted notation instead of blocks:

dsbulk.connector.name = "csv"
dsbulk.connector.csv.url = "C:\\Users\\My Folder"
dsbulk.connector.csv.delimiter = "\t"

You can split your configuration into more than one file using file inclusions. For details, see the HOCON documentation.
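As a minimal sketch of a file inclusion, the main configuration file below pulls in a second file and then sets one more option. The file name connector-settings.conf is hypothetical and used only for illustration; a relative include is resolved relative to the directory of the including file.

```hocon
# application.conf (sketch)
# connector-settings.conf is a hypothetical file name, assumed to sit
# in the same directory as this file.
include "connector-settings.conf"

# Settings defined after the include are merged with the included ones;
# if the same key appears in both places, the later value wins.
dsbulk.connector.csv.delimiter = "\t"
```

The included file is itself a HOCON file and must use the same full prefixes, for example dsbulk.connector.name = "csv".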

The default configuration file includes another file called driver.conf, also located in the DataStax Bulk Loader conf directory. Use the driver.conf file to configure the DataStax Java Driver for DataStax Bulk Loader. This file is also initially empty, and you can customize it to your needs. A driver template configuration file, included in the DataStax Bulk Loader tarball under the ./manual folder, can serve as a starting point for further customization.
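As a rough illustration, a customized driver.conf might look like the sketch below. The host names and the datacenter name are placeholders, and the options shown are assumed to match the Java driver version bundled with your DataStax Bulk Loader release; check the driver template file for the options your version actually supports.

```hocon
# driver.conf (sketch; all values below are placeholders)
datastax-java-driver {
  basic {
    # Nodes to contact initially, as host:port pairs (assumed addresses).
    contact-points = ["host1.example.com:9042", "host2.example.com:9042"]

    # Datacenter that the driver treats as local (assumed name).
    load-balancing-policy.local-datacenter = "dc1"

    # Consistency level applied to requests.
    request.consistency = QUORUM
  }
}
```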

Important caveats

In configuration files for DataStax Bulk Loader:

  • You cannot omit the prefix dsbulk. For example, to select the connector to use in a configuration file, use dsbulk.connector.name = csv, as in the example above; on the command line, however, you can use --dsbulk.connector.name csv or --connector.name csv to achieve the same effect.

  • You cannot abbreviate the prefix datastax-java-driver to driver. For example, to select the consistency level to use in a configuration file, use datastax-java-driver.basic.request.consistency = QUORUM; on the command line, however, you can use either --datastax-java-driver.basic.request.consistency QUORUM or --driver.basic.request.consistency QUORUM to achieve the same effect.

Options specified on the command line override the same options specified in the configuration files.
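The following sketch illustrates this precedence. The file name my-settings.conf is hypothetical, and the command shown in the comments omits any other options a real invocation would need.

```hocon
# my-settings.conf (hypothetical file name)
# In a configuration file, the full prefix is required.
datastax-java-driver.basic.request.consistency = LOCAL_ONE

# If the file above is passed with -f and the same option is also given
# on the command line, the command-line value is the one used:
#
#   dsbulk load -f my-settings.conf --driver.basic.request.consistency QUORUM
#
# In this sketch the effective consistency level would be QUORUM,
# not LOCAL_ONE.
```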


© 2024 DataStax | Privacy policy | Terms of use
