Installing DataStax Enterprise 5.1 using the binary tarball

Instructions for installing DataStax Enterprise 5.1 on any supported Linux-based platform.

Use these instructions for installing DataStax Enterprise (DSE) 5.1 on Linux-based platforms using a binary tarball.

Some things to know about installing DSE

  • The latest version of DataStax Enterprise 5.1 is 5.1.14.
  • When DSE is installed, it creates a cassandra user in the database. Do not use the cassandra user in production. See Creating superuser accounts.
  • When installed from the binary tarball, DataStax Enterprise runs as a stand-alone process.
  • This procedure installs DataStax Enterprise 5.1 and the developer-related tools Javadoc and DataStax Enterprise demos.

    It does not install OpsCenter, DataStax Agent, DataStax Studio, or DSE Graph Loader.

  • After installing, you must configure and start DataStax Enterprise.

Prerequisites

Hardware requirements

See Recommended production settings and the DataStax Enterprise Reference Architecture white paper.

Procedure

Important: End User License Agreement (EULA). By downloading DataStax products, you confirm that you agree to the processing of information as described in the DataStax website privacy policy and agree to the website terms of use.

In a terminal window:

  1. Verify that a required version of Java is installed:
    java -version
    Note: DataStax recommends the latest build of a Technology Compatibility Kit (TCK) Certified OpenJDK version 8.

    If the installed version is not OpenJDK 8 or Oracle Java 8, see Installing supporting software.

    Important:
    • Although Oracle JRE/JDK 8 is supported, DataStax does more extensive testing on OpenJDK 8 starting with DSE 5.1.11. This change is due to the end of public updates for Oracle JRE/JDK 8. Java 9 is not supported.
    • Package management tools do not install OpenJDK 8 or Oracle Java.
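The version check in step 1 can be scripted. The following is a minimal sketch, not part of the DataStax tooling: the helper name java_major is hypothetical, and it assumes the first line of the java -version output has the usual form, such as openjdk version "1.8.0_292".

```shell
# Hypothetical helper: print the major Java version found in one line
# of `java -version` output, for example: openjdk version "1.8.0_292"
java_major() {
  # Pull out the quoted version string; empty if the line has none.
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;  # legacy scheme: 1.8.0_292 -> 8
    *)   printf '%s\n' "$ver" | cut -d. -f1 ;;  # newer scheme: 9.0.4 -> 9
  esac
}
```

To check the local JVM, pass the helper the first line of the real output, for example java_major "$(java -version 2>&1 | head -n 1)". The redirect is needed because java -version writes to standard error. Any result other than 8 means the installed Java is not one of the versions this procedure expects.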
  2. When installing from the binary tarball, you can either download the tarball and then extract the files, or use curl.
    • Download and extract the tarball specifying the version:
      Note: The latest version is 5.1.14. To view the available versions, see the Release notes.
      1. Download the tarball from Download DataStax Enterprise.
      2. Extract the files:
        tar -xzvf dse-5.1.14-bin.tar.gz
    • Use curl to download and extract the tarball in a single command:
      curl -L https://downloads.datastax.com/enterprise/dse-5.1.14-bin.tar.gz | tar xz

    The files are downloaded and extracted into the dse-version directory.
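If you script the download, two small helpers can make the version and the extraction directory explicit. This is a sketch under assumptions: both function names are hypothetical, and dse_tarball_url assumes that other versions follow the same URL pattern as the 5.1.14 URL shown in step 2.

```shell
# Hypothetical helper: build the download URL for a DSE version, assuming
# every version follows the same pattern as the 5.1.14 URL in this procedure.
dse_tarball_url() {
  printf 'https://downloads.datastax.com/enterprise/dse-%s-bin.tar.gz\n' "$1"
}

# Hypothetical helper: print the top-level directory inside a gzipped
# tarball without extracting it, so you know where the files will land.
tarball_top_dir() {
  tar -tzf "$1" | awk -F/ 'NR == 1 { print $1 }'
}
```

For example, curl -L -O "$(dse_tarball_url 5.1.14)" would save the archive locally, and tarball_top_dir dse-5.1.14-bin.tar.gz would then report the directory that tar -xzvf creates.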

  3. You can use either the default data and logging directory locations or define your own locations:
    • To use the default data and logging directory locations, create and change ownership for the following:
      sudo mkdir -p /var/lib/cassandra; sudo chown -R $USER:$GROUP /var/lib/cassandra &&
        sudo mkdir -p /var/log/cassandra; sudo chown -R $USER:$GROUP /var/log/cassandra &&
        sudo mkdir -p /var/lib/dsefs; sudo chown -R $USER:$GROUP /var/lib/dsefs
    • To define your own data and logging directory locations:
      1. In the installation_location, make the data and logging directories. For example:
        mkdir dse-data &&
          cd dse-data &&
          mkdir commitlog &&
          mkdir saved_caches &&
          mkdir hints &&
          mkdir cdc_raw
      2. Go to the directory containing the cassandra.yaml file:
        cd installation_location/resources/cassandra/conf
      3. Edit the following lines in the cassandra.yaml file:
        data_file_directories:
            - full_path_to_installation_location/dse-data
        commitlog_directory: full_path_to_installation_location/dse-data/commitlog
        saved_caches_directory: full_path_to_installation_location/dse-data/saved_caches
        hints_directory: full_path_to_installation_location/dse-data/hints
        cdc_raw_directory: full_path_to_installation_location/dse-data/cdc_raw
    • Optional: Define your own Spark directories:
      1. Make the directories for the Spark lib and log directories.
      2. Edit the spark-env.sh file to match the locations of your Spark lib and log directories, as described in Configuring Spark nodes.
      3. Make a directory for the DSEFS data directory and set its location in dsefs_options.

      spark-env.sh

      The default location of the spark-env.sh file depends on the type of installation:

      • Package installations and Installer-Services installations: /etc/dse/spark/spark-env.sh
      • Tarball installations and Installer-No Services installations: installation_location/resources/spark/conf/spark-env.sh

    DataStax Enterprise is ready for additional configuration.

  4. You can use either the default data and logging directory locations or define your own locations:
    • Default directory locations: If you want to use the default data and logging directory locations, create and change ownership for the following:
      • /var/lib/cassandra
      • /var/log/cassandra
      sudo mkdir -p /var/lib/cassandra; sudo chown -R $USER:$GROUP /var/lib/cassandra &&
        sudo mkdir -p /var/log/cassandra; sudo chown -R $USER:$GROUP /var/log/cassandra
    • Define your own directory locations: If you want to define your own data and logging directory locations:
      1. In the installation_location, make the data and logging directories. For example:
        mkdir dse-data; chown -R $USER:$GROUP dse-data &&
          cd dse-data && 
          mkdir commitlog; chown -R $USER:$GROUP commitlog && 
          mkdir saved_caches; chown -R $USER:$GROUP saved_caches &&
          mkdir hints; chown -R $USER:$GROUP hints && 
          mkdir cdc_raw; chown -R $USER:$GROUP cdc_raw
      2. Go to the directory containing the cassandra.yaml file:
        cd installation_location/resources/cassandra/conf
      3. Update the following lines in the cassandra.yaml file to match the custom locations:
        data_file_directories: 
                - full_path_to_installation_location/dse-data
        commitlog_directory: full_path_to_installation_location/dse-data/commitlog
        saved_caches_directory: full_path_to_installation_location/dse-data/saved_caches
        hints_directory: full_path_to_installation_location/dse-data/hints
        cdc_raw_directory: full_path_to_installation_location/dse-data/cdc_raw
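The custom-directory steps can be combined into one helper that creates the tree and prints the matching settings. This is a sketch only: the function name make_dse_data_dirs is hypothetical, it does not edit cassandra.yaml for you, and it places cdc_raw under dse-data, which is where the mkdir example creates it.

```shell
# Hypothetical helper: create the custom data directories under a base
# directory ($1) and print the matching cassandra.yaml settings.
make_dse_data_dirs() {
  base=$1
  for sub in commitlog saved_caches hints cdc_raw; do
    mkdir -p "$base/dse-data/$sub" || return 1
  done
  # Print the lines to copy into cassandra.yaml.
  cat <<EOF
data_file_directories:
    - $base/dse-data
commitlog_directory: $base/dse-data/commitlog
saved_caches_directory: $base/dse-data/saved_caches
hints_directory: $base/dse-data/hints
cdc_raw_directory: $base/dse-data/cdc_raw
EOF
}
```

Call it with the full path to the installation location and paste the printed lines over the corresponding entries in cassandra.yaml. Because it uses mkdir -p, running it a second time is harmless.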
  5. Optional: If using DSE Analytics, you can use either the default Spark data and logging directory locations or define your own locations:
    • Default directory locations: If you want to use the default Spark directory locations, create and change ownership for the following:
      • /var/lib/dsefs
      • /var/lib/spark
      • /var/log/spark
      sudo mkdir -p /var/lib/dsefs; sudo chown -R $USER:$GROUP /var/lib/dsefs && 
        sudo mkdir -p /var/lib/spark; sudo chown -R $USER:$GROUP /var/lib/spark && 
        sudo mkdir -p /var/log/spark; sudo chown -R $USER:$GROUP /var/log/spark &&
        sudo mkdir -p /var/lib/spark/rdd; sudo chown -R $USER:$GROUP /var/lib/spark/rdd  &&
        sudo mkdir -p /var/log/spark/master; sudo chown -R $USER:$GROUP /var/log/spark/master  &&
        sudo mkdir -p /var/log/spark/alwayson_sql; sudo chown -R $USER:$GROUP /var/log/spark/alwayson_sql  &&
        sudo mkdir -p /var/lib/spark/worker; sudo chown -R $USER:$GROUP /var/lib/spark/worker
    • Define your own directory locations: If you want to define your own Spark directory locations:
      1. In the installation_location, make the DSEFS and Spark data and logging directories. For example:
        mkdir dsefs; chown -R $USER:$GROUP dsefs &&
          mkdir spark; chown -R $USER:$GROUP spark &&  
          cd spark && 
          mkdir log; chown -R $USER:$GROUP log &&
          mkdir rdd; chown -R $USER:$GROUP rdd && 
          mkdir worker; chown -R $USER:$GROUP worker &&
          cd log &&
          mkdir worker; chown -R $USER:$GROUP worker &&
          mkdir master; chown -R $USER:$GROUP master &&
          mkdir alwayson_sql; chown -R $USER:$GROUP alwayson_sql
      2. Go to the directory containing the spark-env.sh file:
        cd installation_location/resources/spark/conf
      3. Uncomment and update the following lines in the spark-env.sh file:
        export SPARK_WORKER_DIR="full_path_to_installation_location/spark/worker"
        export SPARK_EXECUTOR_DIRS="full_path_to_installation_location/spark/rdd"
        export SPARK_WORKER_LOG_DIR="full_path_to_installation_location/spark/log/worker"
        export SPARK_MASTER_LOG_DIR="full_path_to_installation_location/spark/log/master"
        export ALWAYSON_SQL_LOG_DIR="full_path_to_installation_location/spark/log/alwayson_sql"
      4. Go to the directory containing the dse.yaml file:
        cd installation_location/resources/dse/conf
      5. Uncomment and update the DSEFS work directory under dsefs_options in dse.yaml:
        work_dir: full_path_to_installation_location/dsefs

    DataStax Enterprise is ready for additional configuration.
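The "uncomment and update" edits to spark-env.sh can also be applied with a small script instead of a text editor. This is a minimal sketch, not a DataStax utility: the function name set_export is hypothetical, and it assumes each target line begins with optional # characters and spaces followed by export NAME=. Check your copy of spark-env.sh before relying on it.

```shell
# Hypothetical helper: uncomment (if needed) and set `export NAME="value"`
# in a shell config file such as spark-env.sh.
# Arguments: $1 = file, $2 = variable name, $3 = new value.
# Assumes the value contains no "|" or "&" characters.
set_export() {
  file=$1 name=$2 value=$3
  tmp=$(mktemp) || return 1
  sed "s|^[# ]*export ${name}=.*|export ${name}=\"${value}\"|" "$file" > "$tmp" &&
    cat "$tmp" > "$file"   # write back in place, keeping the file's permissions
  status=$?
  rm -f "$tmp"
  return $status
}
```

For example, set_export spark-env.sh SPARK_WORKER_DIR full_path_to_installation_location/spark/worker rewrites only that line and leaves the rest of the file untouched. A line that does not match the assumed pattern is left as-is, so verify the result with grep afterwards.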

  6. Optional: Single-node cluster installations only:
    1. Start DataStax Enterprise from the installation directory:
      bin/dse cassandra
      Note: For other start options, see Starting DataStax Enterprise as a stand-alone process.
    2. From the installation directory, verify that DataStax Enterprise is running:
      bin/nodetool status
      Results using vnodes:
      Datacenter: Cassandra
      =====================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address    Load       Tokens  Owns    Host ID                               Rack
      UN  127.0.0.1  82.43 KB   128     ?       40725dc8-7843-43ae-9c98-7c532b1f517e  rack1
      Results not using vnodes:
      Datacenter: Analytics
      =====================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address         Load       Owns    Host ID                               Token                 Rack
      UN  172.16.222.136  103.24 KB  ?       3c1d0657-0990-4f78-a3c0-3e0c37fc3a06  1647352612226902707   rack1
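In both sample outputs, the first column of each node line combines the status (U or D) and the state (N, L, J, or M), so UN means Up and Normal. The check can be automated with a short filter. This is a sketch under assumptions: the function name all_nodes_up is hypothetical, and it relies only on the column layout shown in the sample results.

```shell
# Hypothetical helper: read `nodetool status` output on stdin and succeed
# only if at least one node is listed and every node is UN (Up/Normal).
all_nodes_up() {
  awk '
    # Node lines start with a two-letter status/state code such as UN or DN.
    $1 ~ /^[UD][NLJM]$/ { total++; if ($1 != "UN") bad++ }
    END { exit ((total > 0 && bad == 0) ? 0 : 1) }
  '
}
```

From the installation directory, bin/nodetool status | all_nodes_up && echo "node is up" prints the message only when every listed node reports UN.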

What's next

cassandra.yaml

The location of the cassandra.yaml file depends on the type of installation:

  • Package installations and Installer-Services installations: /etc/dse/cassandra/cassandra.yaml
  • Tarball installations and Installer-No Services installations: installation_location/resources/cassandra/conf/cassandra.yaml

dse.yaml

The location of the dse.yaml file depends on the type of installation:

  • Package installations and Installer-Services installations: /etc/dse/dse.yaml
  • Tarball installations and Installer-No Services installations: installation_location/resources/dse/conf/dse.yaml
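
Scripts that must work across installation types can look up these locations with a small function. This is a sketch, not a DataStax command: the function name dse_conf_path and the type keywords package and tarball are hypothetical, and the paths are the ones listed above.

```shell
# Hypothetical helper: print the path of a DSE configuration file.
# $1 is "package" (also covers Installer-Services) or "tarball"
# (also covers Installer-No Services); $2 is cassandra.yaml or dse.yaml;
# $3 is the installation_location, used only for tarball layouts.
dse_conf_path() {
  case "$1:$2" in
    package:cassandra.yaml) printf '%s\n' /etc/dse/cassandra/cassandra.yaml ;;
    package:dse.yaml)       printf '%s\n' /etc/dse/dse.yaml ;;
    tarball:cassandra.yaml) printf '%s\n' "$3/resources/cassandra/conf/cassandra.yaml" ;;
    tarball:dse.yaml)       printf '%s\n' "$3/resources/dse/conf/dse.yaml" ;;
    *) return 1 ;;  # unknown installation type or file name
  esac
}
```

For a tarball installation, dse_conf_path tarball dse.yaml full_path_to_installation_location prints the dse.yaml path under resources/dse/conf. An unrecognized type or file name returns a non-zero status and prints nothing.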