Installing DataStax Enterprise 5.1 using the binary tarball

Use these instructions for installing DataStax Enterprise (DSE) 5.1 on Linux-based platforms using a binary tarball.

Procedure

End User License Agreement (EULA). By downloading this DataStax product, you agree to the terms of the EULA.

In a terminal window:

  1. Verify that a required version of Java is installed:

    java -version

    DataStax recommends the latest build of a Technology Compatibility Kit (TCK) Certified OpenJDK version 8.

    With OpenJDK, the output should look similar to:

    openjdk version "1.8.0_171"
    OpenJDK Runtime Environment (build 1.8.0_171-8u171-b11-0ubuntu0.16.04.1-b11)
    OpenJDK 64-Bit Server VM (build 25.171-b11, mixed mode)

    With Oracle Java, the output should look similar to:

    java version "1.8.0_181"
    Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)

    If the output shows neither OpenJDK 8 nor Oracle Java 8, see Installing the JDK.
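
The version check in this step can also be scripted. The following is a minimal sketch, assuming a POSIX shell; the function name is_java8 is hypothetical and not part of DSE. It looks for a 1.8 version string in the text printed by java -version, which is written to stderr rather than stdout.

```shell
#!/bin/sh
# Hypothetical helper (not part of DSE): succeeds when the given text
# contains a Java 8 version line such as: openjdk version "1.8.0_171"
is_java8() {
    printf '%s\n' "$1" | grep -Eq 'version "1\.8\.'
}

# `java -version` prints to stderr, so redirect it to stdout before capturing.
if command -v java >/dev/null 2>&1; then
    if is_java8 "$(java -version 2>&1)"; then
        echo "Java 8 detected"
    else
        echo "Java 8 not detected; see Installing the JDK"
    fi
else
    echo "java not found on PATH; see Installing the JDK"
fi
```

The check only confirms the major version; it does not tell you whether the build is the latest one available.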

  2. When installing from the binary tarball, you can either download the tarball and then extract the files, or use curl.

    • Download and extract the tarball:

      The latest version is 5.1. To view the available versions, see the Release notes.

      1. Download the tarball from Download DataStax Enterprise.

      2. Extract the files:

        tar -xzvf dse-version_number-bin.tar.gz

        For example:

        tar -xzvf dse-5.1-bin.tar.gz

    • Use curl to install the selected version:

      If you choose this method and supply credentials on the command line, your password is retained in the shell history. To avoid this security issue, DataStax recommends using curl with the --netrc or --netrc-file option.

      Download and extract the tarball using curl:

      curl -L https://downloads.datastax.com/enterprise/dse-version_number-bin.tar.gz | tar xz

      For example:

      curl -L https://downloads.datastax.com/enterprise/dse-5.1-bin.tar.gz | tar xz

    The files are downloaded and extracted into the 5.1 directory.
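
If your download requires credentials, the --netrc-file recommendation above might be followed as in this sketch. It assumes a POSIX shell; the helper name write_netrc and the file name ~/.dse-netrc are hypothetical. The helper reads the password from standard input, so it never appears as a command-line argument, and it writes a netrc entry that only the current user can read.

```shell
#!/bin/sh
# Hypothetical helper: write a single netrc entry with owner-only permissions.
# $1 = netrc file path, $2 = host, $3 = login; the password is read from stdin.
write_netrc() {
    IFS= read -r netrc_password || return 1
    (
        umask 077
        printf 'machine %s login %s password %s\n' "$2" "$3" "$netrc_password" > "$1"
    ) || return 1
    # Tighten permissions even if the file already existed.
    chmod 600 "$1"
}

# Example (commented out: it needs real credentials and network access).
# Type the password on the next line and press Enter; it is echoed to the
# terminal but not stored in the shell history.
# write_netrc "$HOME/.dse-netrc" downloads.datastax.com my_login
# curl -L --netrc-file "$HOME/.dse-netrc" \
#     https://downloads.datastax.com/enterprise/dse-5.1-bin.tar.gz | tar xz
```

Delete the netrc file once the download completes if you do not need it again.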

  3. You can use either the default data and logging directory locations or define your own locations:

    • To use the default data and logging directory locations, create the following directories and change their ownership:

      sudo mkdir -p /var/lib/cassandra && sudo chown -R $USER:$GROUP /var/lib/cassandra &&
      sudo mkdir -p /var/log/cassandra && sudo chown -R $USER:$GROUP /var/log/cassandra &&
      sudo mkdir -p /var/lib/dsefs && sudo chown -R $USER:$GROUP /var/lib/dsefs

    • To define your own data and logging directory locations:

      1. In the <installation_location>, create the data and logging directories. For example:

        mkdir installation_location/dse-data &&
          cd installation_location/dse-data &&
          mkdir data &&
          mkdir commitlog &&
          mkdir saved_caches &&
          mkdir hints &&
          mkdir cdc_raw

      2. Go to the directory containing the cassandra.yaml file:

        Where is the cassandra.yaml file?

        The location of the cassandra.yaml file depends on the type of installation:

        Installation Type                                            Location
        Package installations + Installer-Services installations     /etc/dse/cassandra/cassandra.yaml
        Tarball installations + Installer-No Services installations  <installation_location>/resources/cassandra/conf/cassandra.yaml

        cd installation_location/resources/cassandra/conf

      3. Edit the following lines in the cassandra.yaml file:

        data_file_directories:
                - full_path_to_installation_location/dse-data/data
        commitlog_directory: full_path_to_installation_location/dse-data/commitlog
        saved_caches_directory: full_path_to_installation_location/dse-data/saved_caches
        hints_directory: full_path_to_installation_location/dse-data/hints
        cdc_raw_directory: full_path_to_installation_location/dse-data/cdc_raw

    • Optional: Define your own Spark directories:

      1. Make the directories for the Spark lib and log directories.

      2. Edit the spark-env.sh file to match the locations of your Spark lib and log directories, as described in Configuring Spark nodes.

        Where is the spark-env.sh file?

        The default location of the spark-env.sh file depends on the type of installation:

        Installation Type                                            Location
        Package installations + Installer-Services installations     /etc/dse/spark/spark-env.sh
        Tarball installations + Installer-No Services installations  <installation_location>/resources/spark/conf/spark-env.sh

      3. Make a directory for the DSEFS data and set its location in dsefs_options.

  4. You can use either the default data and logging directory locations or define your own locations:

    • Default directory locations: If you want to use the default data and logging directory locations, create and change ownership for the following:

      • /var/lib/cassandra

      • /var/log/cassandra

        sudo mkdir -p /var/lib/cassandra; sudo chown -R $USER:$GROUP /var/lib/cassandra &&
          sudo mkdir -p /var/log/cassandra; sudo chown -R $USER:$GROUP /var/log/cassandra
    • Define your own directory locations: If you want to define your own data and logging directory locations:

      1. In the <installation_location>, create the data and logging directories. For example:

        mkdir dse-data; chown -R $USER:$GROUP dse-data &&
          cd dse-data &&
          mkdir commitlog; chown -R $USER:$GROUP commitlog &&
          mkdir saved_caches; chown -R $USER:$GROUP saved_caches &&
          mkdir hints; chown -R $USER:$GROUP hints &&
          mkdir cdc_raw; chown -R $USER:$GROUP cdc_raw
      2. Go to the directory containing the cassandra.yaml file:

        cd installation_location/resources/cassandra/conf
      3. Update the following lines in the cassandra.yaml file to match the custom locations:

        data_file_directories:
                - full_path_to_installation_location/dse-data
        commitlog_directory: full_path_to_installation_location/dse-data/commitlog
        saved_caches_directory: full_path_to_installation_location/dse-data/saved_caches
        hints_directory: full_path_to_installation_location/dse-data/hints
        cdc_raw_directory: full_path_to_installation_location/dse-data/cdc_raw
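
The custom directory layout from this step can be created in one pass. The following is a minimal sketch, assuming a POSIX shell; the function name make_dse_data_dirs is hypothetical. It creates dse-data and its subdirectories under a base directory and prints each absolute path, so the values can be copied into cassandra.yaml.

```shell
#!/bin/sh
# Hypothetical helper: create <base>/dse-data with the subdirectories used in
# this step, then print the absolute path of dse-data and of each subdirectory.
make_dse_data_dirs() {
    base=$1
    for sub in commitlog saved_caches hints cdc_raw; do
        mkdir -p "$base/dse-data/$sub" || return 1
    done
    # Resolve to an absolute path, because the settings above use full paths.
    root=$(cd "$base/dse-data" && pwd) || return 1
    printf '%s\n' "$root"
    for sub in commitlog saved_caches hints cdc_raw; do
        printf '%s/%s\n' "$root" "$sub"
    done
}

# Example: run from the installation directory.
# make_dse_data_dirs .
```

Because mkdir -p is used, running the helper again on an existing layout is harmless.
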
  5. Optional: If using DSE Analytics, you can use either the default Spark data and logging directory locations or define your own locations:

    • Default directory locations: If you want to use the default Spark directory locations, create and change ownership for the following:

      • /var/lib/dsefs

      • /var/lib/spark

      • /var/log/spark

        sudo mkdir -p /var/lib/dsefs; sudo chown -R $USER:$GROUP /var/lib/dsefs &&
          sudo mkdir -p /var/lib/spark; sudo chown -R $USER:$GROUP /var/lib/spark &&
          sudo mkdir -p /var/log/spark; sudo chown -R $USER:$GROUP /var/log/spark &&
          sudo mkdir -p /var/lib/spark/rdd; sudo chown -R $USER:$GROUP /var/lib/spark/rdd  &&
          sudo mkdir -p /var/log/spark/master; sudo chown -R $USER:$GROUP /var/log/spark/master  &&
          sudo mkdir -p /var/log/spark/alwayson_sql; sudo chown -R $USER:$GROUP /var/log/spark/alwayson_sql  &&
          sudo mkdir -p /var/lib/spark/worker; sudo chown -R $USER:$GROUP /var/lib/spark/worker
    • Define your own directory locations: If you want to define your own Spark directory locations:

      1. In the <installation_location>, create the Spark data and logging directories. For example:

        mkdir dsefs; chown -R $USER:$GROUP dsefs &&
          mkdir spark; chown -R $USER:$GROUP spark &&
          cd spark &&
          mkdir log; chown -R $USER:$GROUP log &&
          mkdir rdd; chown -R $USER:$GROUP rdd &&
          mkdir worker; chown -R $USER:$GROUP worker &&
          cd log &&
          mkdir worker; chown -R $USER:$GROUP worker &&
          mkdir master; chown -R $USER:$GROUP master &&
          mkdir alwayson_sql; chown -R $USER:$GROUP alwayson_sql
      2. Go to the directory containing the spark-env.sh file:

        cd installation_location/resources/spark/conf
      3. Uncomment and update the following lines in the spark-env.sh file:

        export SPARK_WORKER_DIR="full_path_to_installation_location/spark/worker"
        export SPARK_EXECUTOR_DIRS="full_path_to_installation_location/spark/rdd"
        export SPARK_WORKER_LOG_DIR="full_path_to_installation_location/spark/log/worker"
        export SPARK_MASTER_LOG_DIR="full_path_to_installation_location/spark/log/master"
        export ALWAYSON_SQL_LOG_DIR="full_path_to_installation_location/spark/log/alwayson_sql"
      4. Go to the directory containing the dse.yaml file, which holds the dsefs_options section:

        cd installation_location/resources/dse/conf
      5. Uncomment and update the DSEFS directory in dse.yaml:

        Where is the dse.yaml file?

        The location of the dse.yaml file depends on the type of installation:

        Installation Type                                            Location
        Package installations + Installer-Services installations     /etc/dse/dse.yaml
        Tarball installations + Installer-No Services installations  <installation_location>/resources/dse/conf/dse.yaml

        work_dir: full_path_to_installation_location/dsefs
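
The custom Spark layout and the matching spark-env.sh lines can be generated together. The following is a minimal sketch, assuming a POSIX shell; the function name make_spark_dirs is hypothetical. It creates the dsefs and spark directories under an existing base directory and prints the five export lines shown above with absolute paths filled in.

```shell
#!/bin/sh
# Hypothetical helper: create the custom DSEFS and Spark directories under an
# existing <base> directory and print the matching spark-env.sh export lines.
make_spark_dirs() {
    base=$(cd "$1" && pwd) || return 1
    mkdir -p "$base/dsefs" \
             "$base/spark/rdd" \
             "$base/spark/worker" \
             "$base/spark/log/worker" \
             "$base/spark/log/master" \
             "$base/spark/log/alwayson_sql" || return 1
    printf 'export SPARK_WORKER_DIR="%s"\n' "$base/spark/worker"
    printf 'export SPARK_EXECUTOR_DIRS="%s"\n' "$base/spark/rdd"
    printf 'export SPARK_WORKER_LOG_DIR="%s"\n' "$base/spark/log/worker"
    printf 'export SPARK_MASTER_LOG_DIR="%s"\n' "$base/spark/log/master"
    printf 'export ALWAYSON_SQL_LOG_DIR="%s"\n' "$base/spark/log/alwayson_sql"
}

# Example: run from the installation directory.
# make_spark_dirs .
```

The printed lines still have to be placed in spark-env.sh by hand, and work_dir in dse.yaml set to the dsefs directory that the helper creates.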

    Result

    DataStax Enterprise is ready for additional configuration:

    • For production, be sure to change the cassandra user. Failing to do so is a security risk. See Creating superuser accounts.

    • DataStax Enterprise provides several types of workloads (default is transactional). See startup options for service or stand-alone installations.

    • Next steps below provides links to related tasks and information.

  6. Optional: Single-node cluster installations only:

    1. Start DataStax Enterprise from the installation directory:

      bin/dse cassandra
    2. From the installation directory, verify that DataStax Enterprise is running:

      bin/nodetool status

      Results using vnodes:

      Datacenter: Cassandra
      =====================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address    Load       Tokens  Owns    Host ID                               Rack
      UN  127.0.0.1  82.43 KB   128     ?       40725dc8-7843-43ae-9c98-7c532b1f517e  rack1

      Results not using vnodes:

      Datacenter: Analytics
      =====================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address         Load       Owns    Host ID                               Token                 Rack
      UN  172.16.222.136  103.24 KB  ?       3c1d0657-0990-4f78-a3c0-3e0c37fc3a06  1647352612226902707   rack1
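
The status check in this step can be automated by counting the nodes reported as UN (Up and Normal). The following is a minimal sketch, assuming a POSIX shell with awk; the function name count_up_normal is hypothetical. It reads nodetool status output from standard input and works for both the vnode and non-vnode layouts shown above, because the status code is the first field in each.

```shell
#!/bin/sh
# Hypothetical helper: count nodes reported as Up/Normal (UN) in
# `nodetool status` output read from stdin.
count_up_normal() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Example (run from the installation directory of a started node):
# bin/nodetool status | count_up_normal
```

For a single-node cluster, the expected count is 1 once the node has finished starting.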

Next steps

