DSEFS authentication
DSEFS works with secured DataStax Enterprise clusters.
spark-defaults.conf
The default location of the spark-defaults.conf file depends on the type of installation:
Package installations | /etc/dse/spark/spark-defaults.conf
Tarball installations | installation_location/resources/spark/conf/spark-defaults.conf
DSEFS authentication with secured clusters
Authentication is required only when it is enabled in the cluster; it is off by default. DSEFS on secured clusters requires the DseAuthenticator.
DSEFS authenticates through DSE Unified Authentication and supports every authentication scheme that DSE Authenticator supports, including Kerberos.
DSEFS authentication can secure client-to-server communication.
Spark applications
Provide credentials to Spark applications in one of these ways:
- Set them with the dse spark-submit command using one of its credential options.
- Programmatically set the user credentials in the Spark configuration object before the SparkContext is created:
  conf.set("spark.hadoop.com.datastax.bdp.fs.client.authentication.basic.username", <user>)
  conf.set("spark.hadoop.com.datastax.bdp.fs.client.authentication.basic.password", <pass>)
  If a Kerberos authentication token is in use, you do not need to set any properties in the context object. If you need to set the token explicitly, set the spark.hadoop.cassandra.auth.token property.
- When running the Spark Shell, where the SparkContext is created at startup, set the properties in the Hadoop configuration object:
  sc.hadoopConfiguration.set("com.datastax.bdp.fs.client.authentication.basic.username", <user>)
  sc.hadoopConfiguration.set("com.datastax.bdp.fs.client.authentication.basic.password", <pass>)
  Note the absence of the spark.hadoop prefix.
- When running a Spark application or the Spark Shell, provide properties in the spark-defaults.conf configuration file:
  <property>
    <name>com.datastax.bdp.fs.client.authentication.basic.username</name>
    <value>username</value>
  </property>
  <property>
    <name>com.datastax.bdp.fs.client.authentication.basic.password</name>
    <value>password</value>
  </property>
  Optional: If you want to use this method but do not have privileges to write to core-default.xml, copy this file to any location path and set the HADOOP2_CONF_DIR environment variable to point to that path:
  export HADOOP2_CONF_DIR=path
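The spark.hadoop-prefixed properties can also be supplied as Spark properties at submit time. The following is a minimal shell sketch, assuming that the standard spark-submit --conf option is passed through by dse spark-submit. The helper name dsefs_conf_args, the class com.example.MyApp, the file my-app.jar, and the user alice are hypothetical. Values passed on a command line may be visible to other local users, so treat this as an illustration rather than a recommended practice.

```shell
# Hypothetical helper (not part of DSE): print the two property assignments
# that carry DSEFS basic-auth credentials, one per line, with the
# spark.hadoop prefix required in the Spark configuration.
#   $1 = username, $2 = password
dsefs_conf_args() {
  prefix="spark.hadoop.com.datastax.bdp.fs.client.authentication.basic"
  printf '%s\n' "$prefix.username=$1" "$prefix.password=$2"
}

# Print the properties for placeholder credentials.
dsefs_conf_args "alice" "secret"

# Assumed usage, not executed here: pass each property as a --conf option.
#   dse spark-submit \
#     --conf "spark.hadoop.com.datastax.bdp.fs.client.authentication.basic.username=alice" \
#     --conf "spark.hadoop.com.datastax.bdp.fs.client.authentication.basic.password=secret" \
#     --class com.example.MyApp my-app.jar
```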
DSEFS shell
Providing authentication credentials to the DSEFS shell works the same way as in other DSE tools. The DSEFS shell supports the authentication methods listed below in priority order. When more than one method can be used, the one with the higher priority is chosen. For example, when the DSE_TOKEN environment variable is set and the DSEFS shell is started with a username and password set as environment variables in the $HOME/.dserc file, the provided username and password are used for authentication because they have higher priority.
- Specifying a username and password.
export DSE_USERNAME=username && export DSE_PASSWORD=password
dse fs 'mkdir /dir1'
- Using a Kerberos delegation token.
See dse client-tool cassandra for further
information.
export DSE_TOKEN=`dse -u token_user -p password client-tool cassandra generate-token`
dse fs 'mkdir /dir1'
- Using a cached Kerberos ticket after authenticating using a tool like kinit.
kinit username
dse fs 'mkdir /dir1'
- Using a Kerberos keytab file and a login configuration file.
If the configuration file is in a non-default location, specify the location using the java.security.auth.login.config property in the DSEFS_SHELL_OPTS variable:
DSEFS_SHELL_OPTS="-Djava.security.auth.login.config=path_to_login_config_file" dse fs
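The priority order of the methods listed above can be pictured as a simple cascade. The sketch below only illustrates that ordering and is not how the DSEFS shell is implemented: the function name dsefs_auth_method is hypothetical, and it inspects only the DSE_USERNAME, DSE_PASSWORD, and DSE_TOKEN environment variables, ignoring the $HOME/.dserc file, the Kerberos ticket cache, and keytab files.

```shell
# Hypothetical helper (not part of DSE): report which credential source
# wins, following the documented priority order.
dsefs_auth_method() {
  if [ -n "${DSE_USERNAME:-}" ] && [ -n "${DSE_PASSWORD:-}" ]; then
    # Highest priority: explicit username and password.
    echo "username and password"
  elif [ -n "${DSE_TOKEN:-}" ]; then
    # Next: a Kerberos delegation token.
    echo "delegation token"
  else
    # Lowest: fall back to Kerberos (cached ticket, then keytab).
    echo "Kerberos ticket cache or keytab"
  fi
}

# Both a token and a username/password pair are set (placeholder values);
# the subshell keeps the assignments from leaking into the caller.
( DSE_USERNAME=alice; DSE_PASSWORD=secret; DSE_TOKEN=abc; dsefs_auth_method )
```

With both a token and a username and password present, the sketch selects the username and password, matching the example in the text.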
DSEFS REST interface
The DSEFS REST interface supports Kerberos authentication using SPNEGO and Kerberos delegation token authentication.
To authenticate with SPNEGO, obtain a Kerberos ticket and let curl negotiate:
kinit ...
curl -v --negotiate -u : "http://localhost:5598/webhdfs/v1/?op=LISTSTATUS"
- Obtain a delegation token using one of these methods:
- dse client-tool
For example, to generate a delegation token with the current user as the token renewer:
dse client-tool cassandra generate-token
- curl
curl -v --negotiate -u : "http://10.200.177.136:5598/webhdfs/v1/?op=GETDELEGATIONTOKEN" # uses Spnego to obtain the token
- Use the delegation token:
curl -v "http://localhost:5598/webhdfs/v1/?op=LISTSTATUS&delegation=delegation_token"
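The request URLs used in these steps differ only in the operation and the optional delegation parameter. The following sketch builds them with a small function, assuming a DSEFS node reachable at localhost:5598 as in the examples. The helper name dsefs_rest_url is hypothetical, and the token is appended as-is without URL encoding, which is an assumption about the token format.

```shell
# Hypothetical helper (not part of DSE): build a DSEFS REST URL for an
# operation, appending the delegation parameter only when a token is given.
#   $1 = host:port, $2 = operation (for example LISTSTATUS), $3 = token (optional)
dsefs_rest_url() {
  url="http://$1/webhdfs/v1/?op=$2"
  if [ -n "${3:-}" ]; then
    url="$url&delegation=$3"
  fi
  printf '%s\n' "$url"
}

# Print the URL for a listing request with a placeholder token.
dsefs_rest_url "localhost:5598" "LISTSTATUS" "delegation_token"

# Assumed usage against a running node, not executed here:
#   curl -v "$(dsefs_rest_url localhost:5598 LISTSTATUS "$DSE_TOKEN")"
```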