DSEFS authentication

DSEFS works with secured DataStax Enterprise clusters.

For related SSL details, see Enabling SSL encryption for DSEFS.

DSEFS authentication with secured clusters

Authentication is required only when it is enabled in the cluster; it is off by default. DSEFS on secured clusters requires the DseAuthenticator. See Configuring DSE Unified Authentication.

DSEFS supports authentication using any combination of DSE Unified Authentication and LDAP pass-through authentication. DSEFS does not support Kerberos directly, but users can authenticate with a delegation token using the DIGEST-MD5 authentication protocol. When a Spark application runs with Kerberos, a delegation token is always generated, and the application is configured to use that token to authenticate the DSEFS client to DSE.

DSEFS authentication applies only to communication between the DSEFS client and the DSEFS server.

Spark applications

For Spark applications, provide authentication credentials in one of these ways:

  • Set with the dse spark-submit command:

    $ dse -u username -p password spark-submit

    Preferably, use the equivalent DSE_USERNAME and DSE_PASSWORD environment variables instead, which keeps the credentials out of the process list and shell history (see the example after this list).

  • Programmatically set the user credentials in the Spark configuration object before the SparkContext is created (a fuller sketch appears after this list):

    conf.set("spark.hadoop.com.datastax.bdp.fs.client.authentication.basic.username", <user>)
    conf.set("spark.hadoop.com.datastax.bdp.fs.client.authentication.basic.password", <pass>)

    If a Kerberos delegation token is in use, you do not need to set any properties in the configuration object. To set the token explicitly, use the spark.hadoop.cassandra.auth.token property.

  • When running the Spark Shell, where the SparkContext is created at startup, set the properties in the Hadoop configuration object:

    sc.hadoopConfiguration.set("com.datastax.bdp.fs.client.authentication", "basic")
    sc.hadoopConfiguration.set("com.datastax.bdp.fs.client.authentication.basic.username", <user>)
    sc.hadoopConfiguration.set("com.datastax.bdp.fs.client.authentication.basic.password", <pass>)

    Note the absence of the spark.hadoop prefix: these properties are set directly on the Hadoop configuration object, so Spark does not need to copy them over.

  • When running a Spark application or the Spark Shell, provide properties in the Hadoop XML configuration file. For example, in /usr/local/lib/dse/resources/hadoop2-client/conf/core-default.xml:

    <property>
        <name>com.datastax.bdp.fs.client.authentication</name>
        <value>basic</value>
    </property>
    <property>
        <name>com.datastax.bdp.fs.client.authentication.basic.username</name>
        <value>username</value>
    </property>
    <property>
        <name>com.datastax.bdp.fs.client.authentication.basic.password</name>
        <value>password</value>
    </property>

    Optional: If you want to use this method but do not have privileges to write to core-default.xml, copy the file to another directory and set the HADOOP2_CONF_DIR environment variable to point to that directory:

    export HADOOP2_CONF_DIR=path
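
For the first option above, the credentials can come from the environment instead of the command line. A minimal sketch, assuming DSE reads the DSE_USERNAME and DSE_PASSWORD environment variables when -u and -p are omitted; myApplication.jar is a placeholder for your application:

    $ export DSE_USERNAME=username
    $ export DSE_PASSWORD=password
    $ dse spark-submit myApplication.jar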

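As a fuller illustration of the programmatic approach (the second option above), the following sketch reads the credentials from the environment and sets them on the SparkConf before the SparkContext is created. This is a minimal sketch rather than DataStax-provided code: the application name is a placeholder, and it assumes the application is submitted with dse spark-submit so the Spark and DSE dependencies are on the classpath.

    import org.apache.spark.{SparkConf, SparkContext}

    object DsefsAuthExample {
      def main(args: Array[String]): Unit = {
        // Read the credentials from the environment rather than hard-coding them.
        val user = sys.env("DSE_USERNAME")
        val pass = sys.env("DSE_PASSWORD")

        val conf = new SparkConf()
          .setAppName("dsefs-auth-example") // placeholder application name
          // The spark.hadoop. prefix tells Spark to copy these entries into
          // the Hadoop configuration that the DSEFS client reads.
          .set("spark.hadoop.com.datastax.bdp.fs.client.authentication.basic.username", user)
          .set("spark.hadoop.com.datastax.bdp.fs.client.authentication.basic.password", pass)

        // The credentials must be in place before the SparkContext is created;
        // setting them afterwards has no effect on the DSEFS client.
        val sc = new SparkContext(conf)

        // ... use sc to read from and write to DSEFS ...

        sc.stop()
      }
    }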