Setting up Kerberos client authentication
Load data into a Kerberos-enabled cluster using DataStax Bulk Loader.
When loading data into a Kerberos-enabled DataStax Enterprise cluster or DataStax Distribution of Apache Cassandra™ cluster, DataStax Bulk Loader must provide Kerberos credentials using one of the following methods:
Configuring the location of the Kerberos configuration file
Set the location of the Kerberos configuration file when it is not in the default location.
Authenticating with Kerberos credentials using a keytab file or ticket cache requires the Kerberos configuration file (krb5.conf). Typically, this file is in the /etc directory. If it is not, obtain one from your Kerberos system administrator.
To use a location other than /etc, set the environment variables for the Kerberos command line tools (such as kinit, klist, and kdestroy) and for DS Bulk Loader.
- Set the KRB5_CONFIG environment variable to the location of krb5.conf. The following example uses the file location $JAVA_HOME/lib/security/krb5.conf:
    export KRB5_CONFIG=$JAVA_HOME/lib/security/krb5.conf
- Add the path to DSBULK_JAVA_OPTS. The following example uses the file location $JAVA_HOME/lib/security/krb5.conf:
    export DSBULK_JAVA_OPTS="$DSBULK_JAVA_OPTS -Djava.security.krb5.conf=$JAVA_HOME/lib/security/krb5.conf"
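The two settings above can be combined in one small setup script. The following is a minimal sketch, not an official procedure: the KRB5_CONF_PATH variable name is an assumption introduced for illustration, and the path is the example location used above.

```shell
# Example location taken from the text above; substitute your own path.
# KRB5_CONF_PATH is a hypothetical helper variable, not a DSBulk setting.
KRB5_CONF_PATH="${JAVA_HOME:-}/lib/security/krb5.conf"

# Warn (but do not abort) when the file is missing, so the problem surfaces early.
if [ ! -r "$KRB5_CONF_PATH" ]; then
  echo "warning: $KRB5_CONF_PATH is not readable" >&2
fi

# Read by the Kerberos command line tools (kinit, klist, kdestroy).
export KRB5_CONFIG="$KRB5_CONF_PATH"

# Read by the JVM that runs DataStax Bulk Loader; existing options are preserved.
export DSBULK_JAVA_OPTS="${DSBULK_JAVA_OPTS:-} -Djava.security.krb5.conf=$KRB5_CONF_PATH"
```

Quoting the DSBULK_JAVA_OPTS value matters because it contains a space. Without the quotes, the shell would treat the -D option as a separate word rather than as part of the variable.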
Using a Kerberos keytab file for authentication
Use a keytab file to get credentials for authentication with a DataStax Enterprise cluster.
To use a Kerberos keytab file, first use the kadmin command to create the keytab file and get a ticket.
Prerequisites
These steps require MIT Kerberos tools.
- Create a keytab file with kadmin.
- Authenticate from Bulk Loader using the ticket:
  - To configure Bulk Loader to use the keytab file, in the application.conf set:
    - driver.auth.provider to DseGSSAPIAuthProvider.
    - driver.auth.principal to the principal name.
    - driver.auth.keyTab to the keytab file, using the full path.
    If multiple principals may have valid tickets in the ticket cache, DSBulk arbitrarily chooses one to use. Specify the principal explicitly by setting driver.auth.principal to the principal name.
    For example:
      ############ MyConfFile.conf ############
      dsbulk {
        # The name of the connector to use
        connector.name = "csv"
        # CSV field delimiter
        connector.csv.delimiter = "|"
        # The keyspace to connect to
        schema.keyspace = "myKeyspace"
        # The table to connect to
        schema.table = "myTable"
        # The field-to-column mapping
        schema.mapping = "0=name, 1=age, 2=email"
        # The authentication configuration for Kerberos
        driver.auth.provider = "DseGSSAPIAuthProvider"
        driver.auth.principal = "principal_name"
        driver.auth.keyTab = "file_path"
      }
    Tip: Additional command line parameters are not required when using this option.
  - Specify Kerberos options on the command line:
      dsbulk load -k ks -t t1 -url ~/data.csv \
        --driver.auth.provider DseGSSAPIAuthProvider \
        --driver.auth.principal dsbulk_principal_name \
        --driver.auth.keyTab file_path
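Because driver.auth.keyTab expects the full path to the keytab file, it can help to check the path before starting a long load. The following is a minimal sketch of the command line option above wrapped in a shell function. The function name dsbulk_load_with_keytab and its return codes are assumptions made for illustration and are not part of DataStax Bulk Loader.

```shell
# Hypothetical wrapper: validate the keytab path, then run the load.
# Usage: dsbulk_load_with_keytab KEYTAB PRINCIPAL [dsbulk load options...]
dsbulk_load_with_keytab() {
  keytab="$1"
  principal="$2"
  shift 2

  # driver.auth.keyTab should be given as a full (absolute) path.
  case "$keytab" in
    /*) ;;
    *) echo "error: keytab path must be absolute: $keytab" >&2; return 2 ;;
  esac

  # Fail early when the keytab file cannot be read by the current user.
  if [ ! -r "$keytab" ]; then
    echo "error: keytab file is not readable: $keytab" >&2
    return 3
  fi

  # Remaining arguments (for example -k, -t, -url) are passed through unchanged.
  dsbulk load "$@" \
    --driver.auth.provider DseGSSAPIAuthProvider \
    --driver.auth.principal "$principal" \
    --driver.auth.keyTab "$keytab"
}
```

A call might look like `dsbulk_load_with_keytab /path/to/file.keytab dsbulk_principal_name -k ks -t t1 -url ~/data.csv`, where the keytab path and principal name are placeholders.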
Using a Kerberos ticket cache for authentication
Use a ticket cache to authenticate with a DataStax Enterprise cluster.
To use the Kerberos ticket cache, first use the kinit command to authenticate with the Kerberos server and obtain a ticket.
Prerequisites
These steps require MIT Kerberos tools.
- Get a Kerberos ticket with kinit.
- Authenticate from Bulk Loader using the ticket:
  - To configure Bulk Loader to use a ticket in the cache, in the application.conf set driver.auth.provider to DseGSSAPIAuthProvider.
    If multiple principals may have valid tickets in the ticket cache, DSBulk arbitrarily chooses one to use. Specify the principal explicitly by setting driver.auth.principal to the principal name.
    For example:
      ############ MyConfFile.conf ############
      dsbulk {
        # The name of the connector to use
        connector.name = "csv"
        # CSV field delimiter
        connector.csv.delimiter = "|"
        # The keyspace to connect to
        schema.keyspace = "myKeyspace"
        # The table to connect to
        schema.table = "myTable"
        # The field-to-column mapping
        schema.mapping = "0=name, 1=age, 2=email"
        # The authentication provider for Kerberos
        driver.auth.provider = "DseGSSAPIAuthProvider"
        driver.auth.principal = "principal_name"
      }
    Tip: Additional command line parameters are not required when using this option.
  - Specify Kerberos options on the command line:
    - Use any cached ticket:
        dsbulk load -k ks -t t1 -url ~/data.csv \
          --driver.auth.provider DseGSSAPIAuthProvider
    - Use a specific principal when more than one ticket is cached:
        dsbulk load -k ks -t t1 -url ~/data.csv \
          --driver.auth.provider DseGSSAPIAuthProvider \
          --driver.auth.principal dsbulk_principal_name
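A load that relies on the ticket cache fails if no valid ticket is cached, so a pre-flight check can save time. The following is a minimal sketch that assumes the MIT Kerberos klist tool, whose -s option runs silently and reports the cache state through its exit status. The function name dsbulk_load_with_ticket_cache is hypothetical and introduced only for illustration.

```shell
# Hypothetical wrapper: check the ticket cache, then run the load.
# Usage: dsbulk_load_with_ticket_cache PRINCIPAL_OR_EMPTY [dsbulk load options...]
dsbulk_load_with_ticket_cache() {
  # First argument: principal name, or "" to use any cached ticket.
  principal="$1"
  shift

  # klist -s prints nothing and exits non-zero when the cache is unreadable or expired.
  if ! klist -s; then
    echo "error: no valid Kerberos ticket; run kinit first" >&2
    return 1
  fi

  if [ -n "$principal" ]; then
    # Select one principal explicitly when more than one ticket is cached.
    dsbulk load "$@" \
      --driver.auth.provider DseGSSAPIAuthProvider \
      --driver.auth.principal "$principal"
  else
    # Let DSBulk use whichever cached ticket it finds.
    dsbulk load "$@" --driver.auth.provider DseGSSAPIAuthProvider
  fi
}
```

For example, `dsbulk_load_with_ticket_cache "" -k ks -t t1 -url ~/data.csv` uses any cached ticket, while passing dsbulk_principal_name as the first argument selects that principal.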