Setting up Kerberos client authentication

When loading data into a Kerberos-enabled DataStax Enterprise (DSE), Hyper-Converged Database (HCD), or DataStax Distribution of Apache Cassandra® cluster, DataStax Bulk Loader must provide Kerberos credentials using either a keytab file or a ticket cache. Both methods need access to the Kerberos configuration file.

Configuring the location of the Kerberos Configuration file

Authenticating with Kerberos credentials using a keytab file or ticket cache requires the Kerberos configuration file (krb5.conf). Typically, this file is in the /etc directory. If it is not there, obtain one from your Kerberos system administrator.

To use a location other than /etc, set the environment variables that tell both the Kerberos command line tools (such as kinit, klist, and kdestroy) and DataStax Bulk Loader where to find the file.

Procedure

If the Kerberos configuration file is not in the default location (/etc), set the path to the file using the following environment variables:

  1. Set the KRB5_CONFIG environment variable to the location of krb5.conf. The Kerberos command line tools read this variable.

    The following example uses the file location $JAVA_HOME/lib/security/krb5.conf:

    export KRB5_CONFIG=$JAVA_HOME/lib/security/krb5.conf
  2. Add the path to DSBULK_JAVA_OPTS as the java.security.krb5.conf system property, so that DataStax Bulk Loader uses the same file.

    The following example uses the same file location:

    export DSBULK_JAVA_OPTS="$DSBULK_JAVA_OPTS -Djava.security.krb5.conf=$JAVA_HOME/lib/security/krb5.conf"
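The two settings can be combined in a small shell function. This is a minimal sketch, not part of DataStax Bulk Loader: the function name use_krb5_conf is hypothetical, and the readability check is an added safeguard that the procedure does not require.

```shell
# Hypothetical helper (not shipped with DSBulk): point both the Kerberos
# command line tools and DSBulk at a krb5.conf outside /etc.
use_krb5_conf() {
  conf_path="$1"

  # Refuse to export a path that cannot be read.
  if [ ! -r "$conf_path" ]; then
    echo "krb5.conf not readable: $conf_path" >&2
    return 1
  fi

  # Read by kinit, klist, and kdestroy.
  export KRB5_CONFIG="$conf_path"

  # Read by DSBulk. Existing options are kept; the ${VAR:+...} form avoids
  # a leading space when DSBULK_JAVA_OPTS is unset or empty.
  export DSBULK_JAVA_OPTS="${DSBULK_JAVA_OPTS:+$DSBULK_JAVA_OPTS }-Djava.security.krb5.conf=$conf_path"
}
```

For example, calling use_krb5_conf "$JAVA_HOME/lib/security/krb5.conf" in the shell that later runs dsbulk sets both variables for that session.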

Using a Kerberos Keytab file for authentication

Use a keytab file to get credentials for authentication with a DataStax Enterprise cluster.

To use a Kerberos keytab file, first create the keytab file with the kadmin command, then use kinit to get a ticket.

Prerequisites

These steps require the MIT Kerberos tools, including kadmin and kinit.

Procedure

  1. Create a keytab file with kadmin.

    1. Start kadmin:

      kadmin
    2. Create the keytab file:

      ktadd -k file_name principal_name
    3. Log in using kinit:

      kinit -k -t file_name principal_name
  2. Authenticate from Bulk Loader using the ticket:

    • To configure Bulk Loader to use the keytab file, set the following in application.conf:

      • driver.auth.provider to DseGSSAPIAuthProvider.

      • driver.auth.principal to the principal name.

      • driver.auth.keyTab to the full path of the keytab file. If multiple principals have valid tickets in the ticket cache, DSBulk arbitrarily chooses one. To choose the principal explicitly, set driver.auth.principal to the principal name.

        For example:

        ############ MyConfFile.conf ############
        
        dsbulk {
           # The name of the connector to use
           connector.name = "csv"
           # CSV field delimiter
           connector.csv.delimiter = "|"
           # The keyspace to connect to
           schema.keyspace = "myKeyspace"
           # The table to connect to
           schema.table = "myTable"
           # The field-to-column mapping
           schema.mapping = "0=name, 1=age, 2=email"
           # The authentication configuration for Kerberos
           driver.auth.provider="DseGSSAPIAuthProvider"
           driver.auth.principal="principal_name"
           driver.auth.keyTab="file_path"
        }

        Additional command line parameters are not required when using this option.

    • Specify Kerberos options on the command line:

      dsbulk load -k ks -t t1 -url ~/data.csv \
      --driver.auth.provider DseGSSAPIAuthProvider \
      --driver.auth.principal dsbulk_principal_name \
      --driver.auth.keyTab file_path
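The command line form above can be wrapped so that a missing keytab is caught before the load starts. This is a minimal sketch under stated assumptions: the function name load_with_keytab is hypothetical, the keyspace ks and table t1 are the placeholders from the example above, and the readability check is an added convenience rather than something DSBulk requires.

```shell
# Hypothetical wrapper (not shipped with DSBulk): fail fast when the keytab
# file is missing, then run the same load command shown above.
load_with_keytab() {
  principal="$1"   # Kerberos principal stored in the keytab
  keytab="$2"      # full path to the keytab file
  csv="$3"         # CSV file to load

  if [ ! -r "$keytab" ]; then
    echo "keytab not readable: $keytab" >&2
    return 1
  fi

  # ks and t1 are placeholder keyspace and table names.
  dsbulk load -k ks -t t1 -url "$csv" \
    --driver.auth.provider DseGSSAPIAuthProvider \
    --driver.auth.principal "$principal" \
    --driver.auth.keyTab "$keytab"
}
```

A call then looks like: load_with_keytab dsbulk_principal_name /path/to/file.keytab ~/data.csv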

Using a Kerberos Ticket Cache for authentication

Use a ticket cache to authenticate with a DataStax Enterprise cluster.

To use the Kerberos ticket cache, first use the kinit command to authenticate with the Kerberos server and obtain a ticket.

Prerequisites

These steps require the MIT Kerberos tools, including kinit and klist.

Procedure

  1. Get a Kerberos ticket:

    1. Authenticate with the Kerberos server and obtain a ticket:

      kinit principal_name@REALM
    2. Verify the ticket and expiration:

      klist

      One or more tickets display in the list with the expiration time.

      Ticket cache: FILE:/tmp/krb5cc_1002
      Default principal: principal_name@REALM
      
      Valid starting       Expires              Service principal
      02/14/2020 21:53:51  02/15/2020 07:53:51  krbtgt/host@REALM
      	renew until 02/15/2020 21:53:49
  2. Authenticate from Bulk Loader using the ticket:

    • To configure Bulk Loader to use a ticket in the cache, set driver.auth.provider to DseGSSAPIAuthProvider in application.conf.

      If multiple principals have valid tickets in the ticket cache, DSBulk arbitrarily chooses one. To choose the principal explicitly, set driver.auth.principal to the principal name.

      For example:

      ############ MyConfFile.conf ############
      
      dsbulk {
         # The name of the connector to use
         connector.name = "csv"
         # CSV field delimiter
         connector.csv.delimiter = "|"
         # The keyspace to connect to
         schema.keyspace = "myKeyspace"
         # The table to connect to
         schema.table = "myTable"
         # The field-to-column mapping
         schema.mapping = "0=name, 1=age, 2=email"
         # The authentication provider for Kerberos
         driver.auth.provider="DseGSSAPIAuthProvider"
         driver.auth.principal="principal_name"
      }

      Additional command line parameters are not required when using this option.

    • Specify Kerberos options on the command line:

      • Use any cached ticket:

        dsbulk load -k ks -t t1 -url ~/data.csv \
        --driver.auth.provider DseGSSAPIAuthProvider
      • Use a specific principal when more than one ticket is cached:

        dsbulk load -k ks -t t1 -url ~/data.csv \
        --driver.auth.provider DseGSSAPIAuthProvider --driver.auth.principal dsbulk_principal_name
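Because a cached ticket can expire between kinit and the load, it can help to check the cache immediately before running DSBulk. The following is a minimal sketch, assuming the MIT klist tool, whose -s option prints nothing and exits with a non-zero status when the cache is unreadable or the ticket has expired. The function name load_with_ticket_cache is hypothetical, and ks and t1 are the placeholders from the examples above.

```shell
# Hypothetical wrapper (not shipped with DSBulk): verify that the ticket
# cache holds a valid ticket, then run the load.
load_with_ticket_cache() {
  csv="$1"              # CSV file to load
  principal="${2:-}"    # optional: principal to use when several are cached

  # MIT klist -s is silent; a non-zero exit means no usable ticket.
  if ! klist -s; then
    echo "no valid Kerberos ticket in the cache; run kinit first" >&2
    return 1
  fi

  if [ -n "$principal" ]; then
    # Use a specific principal.
    dsbulk load -k ks -t t1 -url "$csv" \
      --driver.auth.provider DseGSSAPIAuthProvider \
      --driver.auth.principal "$principal"
  else
    # Use any cached ticket.
    dsbulk load -k ks -t t1 -url "$csv" \
      --driver.auth.provider DseGSSAPIAuthProvider
  fi
}
```

Called with only a file, such as load_with_ticket_cache ~/data.csv, it uses any cached ticket. Passing dsbulk_principal_name as a second argument selects that principal.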

