# Driver options

DataStax Java driver options for the dsbulk command.

This topic describes a commonly used subset of DataStax Java driver options that you can specify with the dsbulk command. Many additional options exist; see the DataStax Java driver configuration reference documentation and the driver matrix.

The options can be used in short form (-h host_name) or long form (--datastax-java-driver.basic.contact-points host_name).

Tip: DataStax Java driver configuration settings start with the prefix datastax-java-driver. On the dsbulk command line, you can abbreviate this prefix to driver, if you prefer.
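
The naming rule in the tip can be illustrated with a short Python sketch. The helper name `expand_prefix` is hypothetical; it only mirrors the abbreviation rule described here and is not how dsbulk parses its command line.

```python
LONG_PREFIX = "--datastax-java-driver."
SHORT_PREFIX = "--driver."


def expand_prefix(option: str) -> str:
    """Rewrite an abbreviated --driver.* option to its full --datastax-java-driver.* form."""
    if option.startswith(SHORT_PREFIX):
        return LONG_PREFIX + option[len(SHORT_PREFIX):]
    # Short forms such as -h and already-expanded options are returned unchanged.
    return option


print(expand_prefix("--driver.basic.request.timeout"))
# prints: --datastax-java-driver.basic.request.timeout
```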

## General options

Specify general options for using dsbulk with the DataStax Java driver. Use these options to define the contact points and port number for the initial connection. Additionally, define policy options pertaining to the DataStax Java driver load balancing policy settings, pooling options, query options, and socket connections.

-h, --driver.basic.contact-points, --datastax-java-driver.basic.contact-points host_name(s)

The contact points to use for the initial connection to the cluster.

These are addresses of Cassandra nodes that the driver uses to discover the cluster topology. Only one contact point is required (the driver retrieves the addresses of the other nodes automatically), but it is usually a good idea to provide more than one: if you supply a single contact point and it is unavailable, the driver cannot initialize itself correctly.

This must be a list of strings with each contact point specified as host or host:port. If the host is specified without a port, the default port specified in basic.default-port will be used. Apache Cassandra 3.0 and earlier and DataStax Enterprise (DSE) 6.7 and earlier require all nodes in a cluster to share the same port.

Valid examples of contact points are:
• IPv4 addresses with ports: [ "192.168.0.1:9042", "192.168.0.2:9042" ]
• IPv4 addresses without ports: [ "192.168.0.1", "192.168.0.2" ]
• IPv6 addresses with ports: [ "fe80:0:0:0:f861:3eff:fe1d:9d7b:9042", "fe80:0:0:f861:3eff:fe1d:9d7b:9044:9042" ]
• IPv6 addresses without ports: [ "fe80:0:0:0:f861:3eff:fe1d:9d7b", "fe80:0:0:f861:3eff:fe1d:9d7b:9044" ]
• Host names with ports: [ "host1.com:9042", "host2.com:9042" ]
• Host names without ports: [ "host1.com", "host2.com" ]
If the host is a DNS name that resolves to multiple A-records, all the corresponding addresses are used. Do not use localhost as a host name, because it resolves to both IPv4 and IPv6 addresses on some platforms. For hosts specified without a port, the port is taken from basic.default-port.
Note: Be sure to enclose address strings that contain special characters in quotes, as shown in these examples:
dsbulk unload -h '["fe80::f861:3eff:fe1d:9d7a"]' -query "SELECT * from foo.bar;"
dsbulk unload -h '["fe80::f861:3eff:fe1d:9d7b","fe80::f861:3eff:fe1d:9d7c"]' -query "SELECT * from foo1.bar1;"
The heuristic to determine whether a contact point is in the form "host" or "host:port" is not 100% accurate for some IPv6 addresses; avoid ambiguous IPv6 addresses such as fe80::f861:3eff:fe1d:1234 because such a string could be interpreted as a combination of IP fe80::f861:3eff:fe1d with port 1234, or as IP fe80::f861:3eff:fe1d:1234 without port. In such cases, DataStax Bulk Loader for Apache Cassandra will not change the contact point. To avoid this issue, provide IPv6 addresses in full form. For example, instead of fe80::f861:3eff:fe1d:1234, provide fe80:0:0:0:0:f861:3eff:fe1d:1234, so that the string is parsed as IP fe80:0:0:0:0:f861:3eff:fe1d with port 1234.
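
As an illustration of the full-form advice, the following Python sketch expands an abbreviated IPv6 address into eight explicit groups and appends the port. It relies only on the standard `ipaddress` module; the helper name `full_form_contact_point` is hypothetical, and the sketch does not reproduce the heuristic that DataStax Bulk Loader applies internally.

```python
import ipaddress


def full_form_contact_point(address: str, port: int) -> str:
    """Write an IPv6 address as eight explicit groups, then append the port."""
    ip = ipaddress.IPv6Address(address)
    # `exploded` yields eight zero-padded groups, for example "fe80:0000:...".
    groups = ip.exploded.split(":")
    # Strip leading zeros but keep every group, so no "::" shorthand remains.
    full = ":".join(format(int(group, 16), "x") for group in groups)
    return f"{full}:{port}"


# The ambiguous string is a valid address on its own ...
ipaddress.IPv6Address("fe80::f861:3eff:fe1d:1234")
# ... and so is the shorter prefix, which makes "address plus port 1234" equally plausible.
ipaddress.IPv6Address("fe80::f861:3eff:fe1d")

print(full_form_contact_point("fe80::f861:3eff:fe1d", 1234))
# prints: fe80:0:0:0:0:f861:3eff:fe1d:1234
```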
Attention: On cloud deployments, DataStax Bulk Loader for Apache Cassandra automatically sets this option to an empty list, because contact points are not allowed to be explicitly provided when connecting to DataStax Astra databases.

Default: 127.0.0.1
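
When the command line is assembled by a script, the quoting shown in the note above can be produced mechanically. The sketch below is one possible approach in Python, using `json.dumps` for the list syntax and `shlex.join` for shell quoting; the host addresses and query are the ones from the examples above, and nothing is executed.

```python
import json
import shlex

hosts = ["fe80::f861:3eff:fe1d:9d7b", "fe80::f861:3eff:fe1d:9d7c"]

argv = [
    "dsbulk", "unload",
    # Compact JSON gives the ["a","b"] list syntax used in the examples above.
    "-h", json.dumps(hosts, separators=(",", ":")),
    "-query", "SELECT * from foo1.bar1;",
]

# shlex.join quotes every argument that contains shell-special characters.
command_line = shlex.join(argv)
print(command_line)
```

Passing `argv` as a list to `subprocess.run` avoids shell quoting entirely; the joined string is useful mainly for logging or for pasting into a terminal.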

-port, --driver.basic.default-port, --datastax-java-driver.basic.default-port port_number

The port to use for basic.contact-points, when a host is specified without a port. All nodes in a cluster must accept connections on the same port number.

Default: 9042

-b, --driver.basic.cloud.secure-connect-bundle, --datastax-java-driver.basic.cloud.secure-connect-bundle string
The location of the secure bundle used to connect to a cloud-based DataStax Astra database. This setting must be a path on the local filesystem or a valid URL. Examples:
"/path/to/bundle.zip"        # path on Linux or macOS
"./path/to/bundle.zip"       # path on Linux or macOS relative to working directory
"~/path/to/bundle.zip"       # path on Linux or macOS relative to home directory
"C:\\path\\to\\bundle.zip"   # path on Windows; escape backslashes
"file:/a/path/to/bundle.zip" # URL with file protocol
"http://host.com/bundle.zip" # URL with HTTP protocol
Note: DataStax Astra Open Beta participants can download the secure connect bundle from the DataStax Cloud console after creating an Astra database. The secure-connect-bundle option is only for Astra databases. Do not use the following options when connecting to cloud-based Astra deployments:
• datastax-java-driver.basic.contact-points
• datastax-java-driver.basic.request.consistency
• datastax-java-driver.advanced.ssl-engine-factory.*

Default: null
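
The restriction on combining the secure connect bundle with other options can be checked before a run. The following Python sketch is a hypothetical pre-flight check, not part of dsbulk: the function name `conflicting_options` and the dictionary layout are assumptions made for illustration, and the option names are the ones listed above.

```python
BUNDLE_OPTION = "datastax-java-driver.basic.cloud.secure-connect-bundle"

# Options that should not be combined with a secure connect bundle.
# The last entry ends with "." so that it matches every ssl-engine-factory.* setting.
FORBIDDEN_WITH_BUNDLE = (
    "datastax-java-driver.basic.contact-points",
    "datastax-java-driver.basic.request.consistency",
    "datastax-java-driver.advanced.ssl-engine-factory.",
)


def conflicting_options(settings: dict) -> list:
    """Return the settings that conflict with a secure connect bundle, sorted by name."""
    if BUNDLE_OPTION not in settings:
        return []
    return sorted(name for name in settings if name.startswith(FORBIDDEN_WITH_BUNDLE))


settings = {
    BUNDLE_OPTION: "/path/to/bundle.zip",
    "datastax-java-driver.basic.contact-points": ["192.168.0.1"],
    "datastax-java-driver.advanced.ssl-engine-factory.class": "DefaultSslEngineFactory",
    "datastax-java-driver.basic.request.timeout": "5 minutes",
}
print(conflicting_options(settings))
```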

-cl, --driver.basic.request.consistency, --datastax-java-driver.basic.request.consistency string

The consistency level to use for all queries. Stronger consistency levels usually reduce throughput. In addition, any level higher than ONE automatically disables continuous paging, which can dramatically reduce read throughput.

Valid values are: ANY, LOCAL_ONE, ONE, TWO, THREE, LOCAL_QUORUM, QUORUM, EACH_QUORUM, ALL.
Note: On cloud deployments, the only accepted consistency level when writing is LOCAL_QUORUM. Therefore, the default value is LOCAL_ONE, except when loading in cloud deployments, in which case the default is automatically changed to LOCAL_QUORUM.

Default: LOCAL_ONE
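
The default selection described in the note can be summarized in a few lines of Python. This is a sketch of the documented rule only; the function name `default_consistency` and the `operation` and `cloud` parameters are hypothetical and do not correspond to dsbulk internals.

```python
VALID_LEVELS = {
    "ANY", "LOCAL_ONE", "ONE", "TWO", "THREE",
    "LOCAL_QUORUM", "QUORUM", "EACH_QUORUM", "ALL",
}


def default_consistency(operation: str, cloud: bool) -> str:
    """Return the default level: LOCAL_QUORUM when loading into a cloud deployment, else LOCAL_ONE."""
    if cloud and operation == "load":
        return "LOCAL_QUORUM"
    return "LOCAL_ONE"


print(default_consistency("load", cloud=True))     # prints: LOCAL_QUORUM
print(default_consistency("unload", cloud=False))  # prints: LOCAL_ONE
```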

--driver.basic.request.timeout, --datastax-java-driver.basic.request.timeout "string"

How long the DataStax Java driver waits for a request to complete. This is a global limit on the duration of a session.execute() call, including any internal retries the driver might do. By default, this value is set very high because DataStax Bulk Loader is optimized for good throughput, rather than good latencies.

Default: "5 minutes"

--driver.basic.request.default-idempotence, --datastax-java-driver.basic.request.default-idempotence {true | false}

The default idempotence for all queries executed in DataStax Bulk Loader. Setting this option to false prevents failed writes from being retried.

Default: true

--driver.basic.request.serial-consistency, --datastax-java-driver.basic.request.serial-consistency string

The serial consistency level to use for writes. It applies only when data is inserted using lightweight transactions; otherwise it is ignored. Valid values are LOCAL_SERIAL and SERIAL.

Default: LOCAL_SERIAL

--driver.basic.request.page-size, --datastax-java-driver.basic.request.page-size number

The page size. This controls how many rows will be retrieved simultaneously in a single network roundtrip (the goal being to avoid loading too many results in memory at the same time). If there are more results, additional requests will be used to retrieve them (either automatically if you iterate with the sync API, or explicitly with the async API's fetchNextPage method). If the value is 0 or negative, it will be ignored and the request will not be paged.

Default: 5000
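
To get a feel for the effect of the page size, the sketch below estimates the minimum number of requests needed to read a result set. It is plain arithmetic under the description above; the function name `min_round_trips` is hypothetical, and a real read may need one extra request to discover that no more pages remain.

```python
import math


def min_round_trips(total_rows: int, page_size: int = 5000) -> int:
    """Lower bound on the number of requests needed to fetch total_rows."""
    if page_size <= 0:
        # A zero or negative page size disables paging: everything arrives in one request.
        return 1
    return max(1, math.ceil(total_rows / page_size))


print(min_round_trips(12_000))        # prints: 3
print(min_round_trips(12_000, 1000))  # prints: 12
```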

--driver.basic.load-balancing-policy.class, --datastax-java-driver.basic.load-balancing-policy.class string

The load balancing policy class to use. If not qualified, the DataStax Java driver assumes that it resides in the package com.datastax.oss.driver.internal.core.loadbalancing. DataStax Bulk Loader uses a special policy that infers the local datacenter from the contact points. You can also specify a custom class that implements LoadBalancingPolicy and has a public constructor with two arguments: the DriverContext and a String representing the profile name.

Default: "com.datastax.oss.driver.internal.core.loadbalancing.DcInferringLoadBalancingPolicy"

--driver.basic.load-balancing-policy.filter.class, --datastax-java-driver.basic.load-balancing-policy.filter.class string

An optional custom filter to include or exclude nodes. If present, the option must be the fully-qualified name of a class that implements java.util.function.Predicate<Node>, and has a public constructor taking a single DriverContext argument. The predicate's test(Node) method is invoked each time the policy processes a topology or state change. If the method returns false, the node is set at distance IGNORED, which means the Java driver never connects to it, and the node is never included in any query plan.

By default, DataStax Bulk Loader for Apache Cassandra provides a node filter implementation that honors the following settings:
• datastax-java-driver.basic.load-balancing-policy.filter.allow: a list of host names or host addresses that should be allowed.
• datastax-java-driver.basic.load-balancing-policy.filter.deny: a list of host names or host addresses that should be denied.

Default: "com.datastax.oss.dsbulk.workflow.commons.policies.lbp.SimpleNodeFilter"

--driver.basic.load-balancing-policy.filter.allow, --datastax-java-driver.basic.load-balancing-policy.filter.allow list

An optional list of host names or host addresses that should be allowed to connect. See datastax-java-driver.basic.contact-points for a full description of accepted formats. This option only has effect when the setting datastax-java-driver.basic.load-balancing-policy.filter.class refers to the DataStax Bulk Loader default node filter implementation: com.datastax.oss.dsbulk.workflow.commons.policies.lbp.SimpleNodeFilter.
Note: Not compatible with DataStax Astra databases.

Default: []

--driver.basic.load-balancing-policy.filter.deny, --datastax-java-driver.basic.load-balancing-policy.filter.deny list

An optional list of host names or host addresses that should be denied the ability to connect. See datastax-java-driver.basic.contact-points for a full description of accepted formats. This option only has effect when the setting datastax-java-driver.basic.load-balancing-policy.filter.class refers to the DataStax Bulk Loader default node filter implementation: com.datastax.oss.dsbulk.workflow.commons.policies.lbp.SimpleNodeFilter.
Note: Not compatible with DataStax Astra databases.

Default: []
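
The interaction of the allow and deny lists can be pictured as a simple predicate. The Python sketch below is an assumption-laden model, not the SimpleNodeFilter implementation: it assumes that an empty allow list admits every host and that the deny list takes precedence, and the name `make_node_filter` is hypothetical.

```python
def make_node_filter(allow=(), deny=()):
    """Build a predicate that decides whether a host may be contacted."""
    allowed, denied = frozenset(allow), frozenset(deny)

    def accept(host: str) -> bool:
        # Assumption: an empty allow list means "no restriction".
        if allowed and host not in allowed:
            return False
        # Assumption: a denied host is rejected even if it is also allowed.
        return host not in denied

    return accept


accept = make_node_filter(allow=["192.168.0.1", "192.168.0.2"], deny=["192.168.0.2"])
print(accept("192.168.0.1"))  # prints: True
print(accept("192.168.0.2"))  # prints: False
print(accept("192.168.0.3"))  # prints: False
```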

--driver.basic.load-balancing-policy.local-datacenter, --datastax-java-driver.basic.load-balancing-policy.local-datacenter string

The datacenter that is considered local. The default load balancing policy only includes nodes from this datacenter in its query plans. Set this option to declare the local datacenter explicitly; otherwise, the DcInferringLoadBalancingPolicy that DataStax Bulk Loader uses by default infers the local datacenter from the provided contact points.

Default: unspecified

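--driver.advanced.retry-policy.max-retries, --datastax-java-driver.advanced.retry-policy.max-retries number

How many times to retry a failed query. Only valid for use with the DataStax Bulk Loader default retry policy (MultipleRetryPolicy).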

Default: 10

## Authorization options

Specify authorization options for using dsbulk with the DataStax Java driver.

--driver.advanced.auth-provider.class, --datastax-java-driver.advanced.auth-provider.class string

The class of the authentication provider. If it is not qualified, the Java driver assumes that it resides in one of the following packages:
• com.datastax.oss.driver.internal.core.auth
• com.datastax.dse.driver.internal.core.auth
The DSE driver provides implementations out of the box:
• PlainTextAuthProvider: uses plain-text credentials. It requires the username and password options. Should be used only when authenticating against Apache Cassandra® clusters; not recommended when authenticating against DSE clusters.
• DsePlainTextAuthProvider: provides SASL authentication using the PLAIN mechanism for DSE clusters secured with DseAuthenticator. It requires the username and password options, and optionally, an authorization-id.
You can also specify a custom class that implements AuthProvider and has a public constructor with a DriverContext argument; to simplify this step, the Java driver provides two abstract classes that can be extended: DsePlainTextAuthProviderBase and DseGssApiAuthProviderBase.

Default: null

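--driver.advanced.auth-provider.username, --datastax-java-driver.advanced.auth-provider.username string

The username to use. Providers that accept this setting: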
• PlainTextAuthProvider
• DsePlainTextAuthProvider
Important: DataStax recommends specifying username and password credentials in a configuration file, instead of on the command line. For an example, refer to Creating configuration files for DataStax Bulk Loader.

Default: null

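--driver.advanced.auth-provider.password, --datastax-java-driver.advanced.auth-provider.password string

The password to use. Providers that accept this setting: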
• PlainTextAuthProvider
• DsePlainTextAuthProvider
Important: DataStax recommends specifying username and password credentials in a configuration file, instead of on the command line. For an example, refer to Creating configuration files for DataStax Bulk Loader.

Default: null
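
As one way to follow the recommendation to keep credentials out of the command line, the Python sketch below writes them to a configuration file that only its owner can read. The file name, the placeholder credentials, and the HOCON layout are assumptions for illustration; the option names are the ones described above, and the authoritative file format is covered in Creating configuration files for DataStax Bulk Loader.

```python
import os
import tempfile

# Assumed HOCON layout; doubled braces are literal braces for str.format.
CONFIG_TEMPLATE = """\
datastax-java-driver {{
  advanced.auth-provider {{
    class = PlainTextAuthProvider
    username = "{username}"
    password = "{password}"
  }}
}}
"""


def write_credentials_file(directory: str, username: str, password: str) -> str:
    """Write the credentials to a file created with owner-only permissions."""
    path = os.path.join(directory, "dsbulk-credentials.conf")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(CONFIG_TEMPLATE.format(username=username, password=password))
    return path


with tempfile.TemporaryDirectory() as tmp:
    config_path = write_credentials_file(tmp, "placeholder_user", "placeholder_password")
    with open(config_path) as handle:
        rendered = handle.read()

print(rendered)
```

The sketch does not escape quotes or backslashes in the values, so credentials containing those characters would need extra handling.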

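--driver.advanced.auth-provider.authorization-id, --datastax-java-driver.advanced.auth-provider.authorization-id string

An authorization ID allows the currently authenticated user to act as a different user (proxy authentication). Providers that accept this setting: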
• DsePlainTextAuthProvider
• DseGssApiAuthProvider

Default: null

## SSL options

Specify SSL encryption options for using dsbulk with the DataStax Java driver. For additional information on SSL, see the Oracle Java Guide on SSL.

--driver.advanced.ssl-engine-factory.class, --datastax-java-driver.advanced.ssl-engine-factory.class string

The class of the SSL engine factory. If not qualified, the DataStax Java driver assumes that it resides in the package com.datastax.oss.driver.internal.core.ssl. The DataStax Java driver provides a single implementation, DefaultSslEngineFactory, which uses the JDK's built-in SSL implementation.

You can also specify a custom class that implements SslEngineFactory and has a public constructor with a DriverContext argument.

Default: null

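--driver.advanced.ssl-engine-factory.hostname-validation, --datastax-java-driver.advanced.ssl-engine-factory.hostname-validation {true | false}

Whether to require validation that the hostname of the server certificate's common name matches the hostname of the server being connected to. This setting is only required when using the default SSL factory. If not set, defaults to true.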

Default: true

--driver.advanced.ssl-engine-factory.truststore-path, --datastax-java-driver.advanced.ssl-engine-factory.truststore-path string

The location used to access truststore contents. If either truststore-path or keystore-path is specified, the DataStax Java driver builds an SSLContext from these files. This setting is only required when using the default SSL factory. If neither option is specified, the default SSLContext is used, which is based on system property configuration.

Default: null

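--driver.advanced.ssl-engine-factory.truststore-password, --datastax-java-driver.advanced.ssl-engine-factory.truststore-password string

The password used to access truststore contents. This setting is only required when using the default SSL factory.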

Default: null

--driver.advanced.ssl-engine-factory.keystore-path, --datastax-java-driver.advanced.ssl-engine-factory.keystore-path string

The location used to access keystore contents. If either truststore-path or keystore-path is specified, the DataStax Java driver builds an SSLContext from these files. This setting is only required when using the default SSL factory. If neither option is specified, the default SSLContext is used, which is based on system property configuration.

Default: null

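--driver.advanced.ssl-engine-factory.keystore-password, --datastax-java-driver.advanced.ssl-engine-factory.keystore-password string

The password used to access keystore contents. This setting is only required when using the default SSL factory.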

Default: null

## Continuous paging options

Attention: Continuous paging options only take effect if continuous paging is globally enabled, which can be done with the executor option dsbulk.executor.continuousPaging.enabled.

--driver.advanced.continuous-paging.page-size, --datastax-java-driver.advanced.continuous-paging.page-size number

The page size. The value is interpreted as a number of rows or a number of bytes, depending on the page-size-in-bytes boolean value. This option controls how many rows (or how much data) are retrieved in a single network roundtrip. The goal is to avoid loading too many results in memory at the same time. If there are more results, additional requests are used to retrieve them, either automatically (if you iterate with the sync API) or explicitly with the async API's fetchNextPage method. The default is the same as the driver's normal request page size: 5000 (rows).

Default: 5000

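--driver.advanced.continuous-paging.page-size-in-bytes, --datastax-java-driver.advanced.continuous-paging.page-size-in-bytes {true | false}

Whether the page-size option should be interpreted in number of rows or bytes. The default of false means page size is interpreted as the number of rows.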

Default: false

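--driver.advanced.continuous-paging.max-pages, --datastax-java-driver.advanced.continuous-paging.max-pages number

The maximum number of pages to return. The default of zero means retrieve all pages.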

Default: 0

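--driver.advanced.continuous-paging.max-pages-per-second, --datastax-java-driver.advanced.continuous-paging.max-pages-per-second number

The maximum number of pages per second. The default of zero means no limit.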

Default: 0

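--driver.advanced.continuous-paging.max-enqueued-pages, --datastax-java-driver.advanced.continuous-paging.max-enqueued-pages number

The maximum number of pages that can be stored in the local queue. This value must be positive.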

Default: 4

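--driver.advanced.continuous-paging.timeout.first-page, --datastax-java-driver.advanced.continuous-paging.timeout.first-page string

How long to wait for the DataStax Bulk Loader coordinator to send the first page.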

Default: "5 minutes"

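--driver.advanced.continuous-paging.timeout.other-pages, --datastax-java-driver.advanced.continuous-paging.timeout.other-pages string

How long to wait for the DataStax Bulk Loader coordinator to send subsequent pages.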

Default: "5 minutes"

## Advanced options

Specify advanced options for using dsbulk with the DataStax Java driver.

--driver.advanced.protocol.version, --datastax-java-driver.advanced.protocol.version string

The native protocol version to use. If not set, the DataStax Java driver looks up the versions of the nodes at startup (by default, system.peers.release_version) and chooses the highest common protocol version.

For example, if you have a mixed cluster with Apache Cassandra 2.1 nodes (protocol v3) and Apache Cassandra 3.0 nodes (protocol v3 and v4), the driver chooses protocol v3. If the nodes do not have a common protocol version, initialization fails. If this option is set, the given version is used for all connections without any negotiation or downgrading. If any of the contact points do not support the protocol version, that contact point is skipped. Once the protocol version is set, it cannot change for the duration of the driver's session. If an incompatible node joins the cluster later, the connection will fail and the driver will not try to reconnect to the node.

Default: null
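
The "highest common protocol version" rule can be sketched in a few lines of Python. This models only the selection logic described above, using the mixed-cluster example from the text; the function name `negotiate_protocol_version` and the node labels are hypothetical, and the real driver derives the supported versions from each node's release version.

```python
def negotiate_protocol_version(supported_by_node: dict) -> int:
    """Pick the highest protocol version that every node supports."""
    if not supported_by_node:
        raise ValueError("at least one node is required")
    common = set.intersection(*(set(versions) for versions in supported_by_node.values()))
    if not common:
        # Mirrors the documented behavior: without a common version, initialization fails.
        raise ValueError("no common protocol version")
    return max(common)


nodes = {
    "cassandra-2.1-node": {3},     # protocol v3 only
    "cassandra-3.0-node": {3, 4},  # protocol v3 and v4
}
print(negotiate_protocol_version(nodes))  # prints: 3
```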

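--driver.advanced.protocol.compression, --datastax-java-driver.advanced.protocol.compression string

The name of the algorithm used to compress protocol frames. Possible values are: lz4, snappy, or none.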

Default: none

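--driver.advanced.connection.pool.local.size, --datastax-java-driver.advanced.connection.pool.local.size number

The number of connections in the pool for nodes considered as local.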

Default: 8

--driver.advanced.connection.pool.remote.size, --datastax-java-driver.advanced.connection.pool.remote.size number

The number of connections in the pool for nodes considered as remote. The default load balancing policy used by DataStax Bulk Loader does not consider remote nodes. As a result, this setting has no effect when using the default load balancing policy.

Default: 8

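--driver.advanced.connection.max-requests-per-connection, --datastax-java-driver.advanced.connection.max-requests-per-connection number

The maximum number of requests that can be executed concurrently on a connection. Applies to local or remote connections. Must be a number between 1 and 32768.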

Default: 32768

--driver.advanced.resolve-contact-points, --datastax-java-driver.advanced.resolve-contact-points {true | false}

Whether to resolve the addresses passed to basic.contact-points.
• If true, addresses are created with InetSocketAddress(String, int). The host name is resolved the first time, and the driver will use the resolved IP address for all subsequent connection attempts.
• If false, addresses are created with InetSocketAddress.createUnresolved(). The host name will be resolved again every time the driver opens a new connection. This is useful for containerized environments where DNS records are more likely to change over time.
Note: JVM and OS have their own DNS caching mechanisms, so you might need additional configuration beyond the driver.
This option only applies to the contact points specified in the configuration. It has no effect on dynamically discovered peers. The driver relies on Cassandra system tables, which expose raw IP addresses. Use a custom address translator (see advanced.address-translator.class) to convert them to unresolved addresses; if you're in a containerized environment, you probably already need address translation.

Default: true

--driver.advanced.timestamp-generator.class, --datastax-java-driver.advanced.timestamp-generator.class string

The class of the microsecond timestamp generator. If it is not qualified, the driver assumes that it resides in the package com.datastax.oss.driver.internal.core.time. The driver provides the following implementations out of the box:
• AtomicTimestampGenerator: timestamps are guaranteed to be unique across all client threads.
• ThreadLocalTimestampGenerator: timestamps that are guaranteed to be unique within each thread only.
• ServerSideTimestampGenerator: do not generate timestamps, let the server assign them.
You can also specify a custom class that implements TimestampGenerator and has a public constructor with two arguments: the DriverContext and a String representing the profile name.

Default: "AtomicTimestampGenerator"

--driver.advanced.address-translator.class, --datastax-java-driver.advanced.address-translator.class string

The class of the translator. If not qualified, the DataStax Java driver assumes that it resides in the package com.datastax.oss.driver.internal.core.addresstranslation. The DataStax Java driver provides the PassThroughAddressTranslator implementation, which returns all addresses unchanged. You can also specify a custom class that implements AddressTranslator and has a public constructor with a DriverContext argument.

Default: "PassThroughAddressTranslator"

--driver.advanced.heartbeat.interval, --datastax-java-driver.advanced.heartbeat.interval string

The heartbeat interval. If a connection stays idle for that duration (there are no reads), the DataStax Java driver sends a dummy message on it to make sure it's still alive. If not, the connection is closed and replaced.

Default: "30 seconds"

--driver.advanced.heartbeat.timeout, --datastax-java-driver.advanced.heartbeat.timeout string

How long the DataStax Java driver waits for the response to a heartbeat. If this timeout occurs, the heartbeat is considered failed.

Default: "60 seconds"

## Deprecated options

Deprecated. The correct option to use is --datastax-java-driver.basic.request.timeout.

--driver.timestampGenerator { AtomicMonotonicTimestampGenerator | ThreadLocalTimestampGenerator | ServerSideTimestampGenerator }

Deprecated. The correct option to use is --datastax-java-driver.advanced.timestamp-generator.class.

-lbp, --driver.policy.lbp.name { dse | dcAwareRoundRobin | roundRobin | whiteList | tokenAware }

Deprecated. The correct option to use is --datastax-java-driver.basic.load-balancing-policy.class.

--driver.policy.lbp.dcAwareRoundRobin.allowRemoteDCsForLocalConsistencyLevel {true | false}

Deprecated. There is no equivalent for this obsolete option.

--driver.policy.lbp.dcAwareRoundRobin.localDc string

Deprecated. The correct option to use is --datastax-java-driver.basic.load-balancing-policy.local-datacenter.

--driver.policy.lbp.dcAwareRoundRobin.usedHostsPerRemoteDc number

Deprecated. There is no equivalent for this obsolete option.

--driver.policy.lbp.dse.childPolicy { dse | dcAwareRoundRobin | roundRobin | whiteList | tokenAware }

Deprecated. There is no equivalent for this obsolete option.

--driver.policy.lbp.tokenAware.childPolicy { dse | dcAwareRoundRobin | roundRobin | whiteList | tokenAware }

Deprecated. There is no equivalent for this obsolete option.

--driver.policy.lbp.tokenAware.shuffleReplicas { true | false }

Deprecated. There is no equivalent for this obsolete option.

--driver.policy.lbp.whiteList.childPolicy { dse | dcAwareRoundRobin | roundRobin | whiteList | tokenAware }

Deprecated. There is no equivalent for this obsolete option.

--driver.policy.lbp.whiteList.hosts string

Deprecated. The correct option to use is --datastax-java-driver.basic.load-balancing-policy.filter.class.

--datastax-java-driver.pooling.heartbeat string

Deprecated. The correct option to use is --datastax-java-driver.advanced.heartbeat.interval.

--driver.pooling.local.connections number

Deprecated. The correct option to use is --datastax-java-driver.advanced.connection.pool.local.size.

--driver.pooling.remote.connections number

Deprecated. The correct option to use is --datastax-java-driver.advanced.connection.pool.remote.size.

--datastax-java-driver.pooling.local.requests number

Deprecated. The correct option to use is --datastax-java-driver.advanced.connection.max-requests-per-connection.

--driver.pooling.remote.requests number

Deprecated. The correct option to use is --datastax-java-driver.advanced.connection.max-requests-per-connection.

--driver.protocol.compression string

Deprecated. The correct option to use is --datastax-java-driver.advanced.protocol.compression.

--driver.query.idempotence {true | false}

Deprecated. The correct option to use is --datastax-java-driver.basic.request.default-idempotence.

--driver.query.serialConsistency string

Deprecated. The correct option to use is --datastax-java-driver.basic.request.serial-consistency.

-maxRetries, --driver.policy.maxRetries number

Deprecated. The correct option to use is --datastax-java-driver.advanced.retry-policy.max-retries.

--driver.ssl.cipherSuites list

Deprecated. The correct option to use is --datastax-java-driver.advanced.ssl-engine-factory.class and related datastax-java-driver.advanced.ssl-engine-factory.* options.

--driver.ssl.keystore.algorithm { SunX509 | NewSunX509 }

Deprecated. The correct option to use is --datastax-java-driver.advanced.ssl-engine-factory.class and related datastax-java-driver.advanced.ssl-engine-factory.* options.

--driver.ssl.keystore.path string

Deprecated. The correct option to use is --datastax-java-driver.advanced.ssl-engine-factory.keystore-path.

--driver.ssl.openssl.keyCertChain string

Deprecated. The correct option to use is --datastax-java-driver.advanced.ssl-engine-factory.class and related datastax-java-driver.advanced.ssl-engine-factory.* options.

--driver.ssl.openssl.privateKey string

Deprecated. The correct option to use is --datastax-java-driver.advanced.ssl-engine-factory.class and related datastax-java-driver.advanced.ssl-engine-factory.* options.

--driver.ssl.provider { None | JDK | OpenSSL }

Deprecated. The correct option to use is --datastax-java-driver.advanced.ssl-engine-factory.class and related datastax-java-driver.advanced.ssl-engine-factory.* options.

--driver.ssl.truststore.algorithm { PKIX | SunX509 }

Deprecated. The correct option to use is --datastax-java-driver.advanced.ssl-engine-factory.class and related datastax-java-driver.advanced.ssl-engine-factory.* options.

--driver.ssl.truststore.path string

Deprecated. The correct option to use is --datastax-java-driver.advanced.ssl-engine-factory.truststore-path.

--driver.query.fetchSize number

Deprecated. The correct option to use is --datastax-java-driver.basic.request.page-size.

-cl, --driver.query.consistency { ANY | LOCAL_ONE | ONE | TWO | THREE | LOCAL_QUORUM | QUORUM | EACH_QUORUM | ALL }

Deprecated. The correct option to use is --datastax-java-driver.basic.request.consistency.

--driver.auth.provider { None | PlainTextAuthProvider | DsePlainTextAuthProvider | DSEGSSAPIAuthProvider }

Deprecated. The correct option to use is --datastax-java-driver.advanced.auth-provider.class.

--driver.auth.authorizationId string

Deprecated. The correct option to use is --datastax-java-driver.advanced.auth-provider.authorization-id.

--driver.auth.keyTab string

Deprecated. The correct options to use are --datastax-java-driver.advanced.auth-provider.class and related --datastax-java-driver.advanced.auth-provider.* settings.

--driver.auth.principal email

Deprecated. The correct options to use are --datastax-java-driver.advanced.auth-provider.class and related --datastax-java-driver.advanced.auth-provider.* settings.

--driver.auth.saslService string

Deprecated. The correct options to use are --datastax-java-driver.advanced.auth-provider.class and related --datastax-java-driver.advanced.auth-provider.* settings.
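
When updating older scripts, the replacements listed in this section can be kept in a lookup table. The Python sketch below covers only a few of the entries above and is meant as a starting point; the names `REPLACEMENTS` and `migrate_flag` are hypothetical, and the table renames flags only.

```python
# Deprecated flag -> replacement, copied from the entries in this section.
# None marks options that are described as obsolete with no equivalent.
REPLACEMENTS = {
    "--driver.timestampGenerator": "--datastax-java-driver.advanced.timestamp-generator.class",
    "--driver.policy.lbp.name": "--datastax-java-driver.basic.load-balancing-policy.class",
    "--driver.policy.lbp.dcAwareRoundRobin.localDc": "--datastax-java-driver.basic.load-balancing-policy.local-datacenter",
    "--driver.pooling.local.connections": "--datastax-java-driver.advanced.connection.pool.local.size",
    "--driver.query.fetchSize": "--datastax-java-driver.basic.request.page-size",
    "--driver.query.consistency": "--datastax-java-driver.basic.request.consistency",
    "--driver.policy.maxRetries": "--datastax-java-driver.advanced.retry-policy.max-retries",
    "--driver.policy.lbp.tokenAware.shuffleReplicas": None,
}


def migrate_flag(flag: str):
    """Return the replacement flag, None if the option is obsolete, or the flag itself if it is not listed."""
    return REPLACEMENTS.get(flag, flag)


print(migrate_flag("--driver.query.fetchSize"))
# prints: --datastax-java-driver.basic.request.page-size
```

Accepted values can differ between an old option and its replacement (for example, a policy name versus a class name), so each migrated value still needs to be reviewed by hand.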