Installing DataStax Enterprise 6.0 using the binary tarball

Use these instructions to install DataStax Enterprise (DSE) 6.0 on any supported Linux-based platform using the binary tarball.

Some things to know about installing DataStax Enterprise:

  • The latest version of DataStax Enterprise 6.0 is 6.0.x.
  • When installed from the binary tarball:
    • DataStax Enterprise runs as a stand-alone process.
    • You can install DSE with or without root permissions.
Warning: When DSE is installed, it creates a default cassandra account in the database. Do not use the cassandra account in production; doing so is a security risk. See .
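
For example, once DSE is running with authentication enabled, you can create your own superuser account from cqlsh and then disable the cassandra account. This is only a minimal sketch: admin_user and the passwords are placeholders, so substitute your own values.

  cqlsh -u cassandra -p cassandra -e "CREATE ROLE admin_user WITH PASSWORD = 'choose_a_strong_password' AND SUPERUSER = true AND LOGIN = true;"
  cqlsh -u admin_user -p choose_a_strong_password -e "ALTER ROLE cassandra WITH SUPERUSER = false AND LOGIN = false;"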

spark-env.sh

The default location of the spark-env.sh file depends on the type of installation:
  • Package installations: /etc/dse/spark/spark-env.sh
  • Tarball installations: installation_location/resources/spark/conf/spark-env.sh

cassandra.yaml

The location of the cassandra.yaml file depends on the type of installation:
  • Package installations: /etc/dse/cassandra/cassandra.yaml
  • Tarball installations: installation_location/resources/cassandra/conf/cassandra.yaml

Prerequisites

Hardware requirements

See .

Important: End User License Agreement (EULA). By downloading this DataStax product, you agree to the terms of the EULA.

In a terminal window:

  1. Verify that a required version of Java is installed:
    java -version
    Note: DataStax recommends the latest build of a Technology Compatibility Kit (TCK) Certified OpenJDK version 8.
    If OpenJDK, the results should look like:
    openjdk version "1.8.0_171"
    OpenJDK Runtime Environment (build 1.8.0_171-8u171-b11-0ubuntu0.16.04.1-b11)
    OpenJDK 64-Bit Server VM (build 25.171-b11, mixed mode)
    If Oracle Java, the results should look like:
    java version "1.8.0_181"
    Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)

    If neither OpenJDK 8 nor Oracle Java 8 is installed, see Installing the JDK.
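
    If more than one JDK is installed, you can point DSE at a specific one by setting JAVA_HOME before starting DSE. A minimal sketch, assuming the JDK lives at /usr/lib/jvm/java-8-openjdk-amd64 (adjust the path for your system):
    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
    export PATH="$JAVA_HOME/bin:$PATH"
    java -version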

  2. Install the libaio package. For example:
    • RHEL platforms:
      sudo yum install libaio
    • Debian platforms:
      sudo apt-get install libaio1
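    To confirm that the library is visible to the dynamic linker, you can run a quick check; the exact output varies by platform:
    ldconfig -p | grep libaio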
  3. When installing from the binary tarball, you can either download the tarball and then extract the files, or use curl.
    • Download and extract the tarball:
      1. Download the tarball from Download DataStax Enterprise.
      2. Extract the files:
        tar -xzvf dse-version_number-bin.tar.gz
        For example:
        tar -xzvf dse-6.0.x-bin.tar.gz
    • Use curl to install the selected version:
      CAUTION: If you choose this method, your password is retained in the shell history. To avoid this security issue, DataStax recommends using curl with the --netrc or --netrc-file option; a sketch of that approach appears at the end of this step.
      1. Download and extract the tarball using curl:
        curl -L https://downloads.datastax.com/enterprise/dse-version_number-bin.tar.gz | tar xz
        For example:
        curl -L https://downloads.datastax.com/enterprise/dse-6.0.x-bin.tar.gz | tar xz

      The files are downloaded and extracted into the dse-version_number directory, for example dse-6.0.x.
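
      To follow the caution above, you can keep your DataStax download credentials out of the shell history by storing them in a ~/.netrc file and letting curl read it. A minimal sketch, assuming the download requires your DataStax account credentials (the login and password values are placeholders):
      echo 'machine downloads.datastax.com login your_email password your_password' > ~/.netrc
      chmod 600 ~/.netrc
      curl -L --netrc https://downloads.datastax.com/enterprise/dse-6.0.x-bin.tar.gz | tar xz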

  4. You can use either the default data and logging directory locations or define your own locations:
    • Default directory locations: If you want to use the default data and logging directory locations, create and change ownership for the following:
      • /var/lib/cassandra
      • /var/log/cassandra
      sudo mkdir -p /var/lib/cassandra; sudo chown -R $USER:$GROUP /var/lib/cassandra &&
        sudo mkdir -p /var/log/cassandra; sudo chown -R $USER:$GROUP /var/log/cassandra
    • Define your own directory locations: If you want to define your own data and logging directory locations:
      1. In the installation_location, create the data and logging directories. For example:
        mkdir dse-data &&
          cd dse-data && 
          mkdir data &&
          mkdir commitlog && 
          mkdir saved_caches &&
          mkdir hints && 
          mkdir cdc_raw
      2. Go to the directory containing the cassandra.yaml file:
        cd installation_location/resources/cassandra/conf
      3. Update the following lines in the cassandra.yaml file to match the custom locations:
        data_file_directories:
          - full_path_to_installation_location/dse-data/data
        commitlog_directory: full_path_to_installation_location/dse-data/commitlog
        saved_caches_directory: full_path_to_installation_location/dse-data/saved_caches
        hints_directory: full_path_to_installation_location/dse-data/hints
        cdc_raw_directory: full_path_to_installation_location/dse-data/cdc_raw
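
        Optionally, before starting DSE you can confirm from the conf directory that the edits took effect; for example:
        grep -nE '^(data_file_directories|commitlog_directory|saved_caches_directory|hints_directory|cdc_raw_directory)' cassandra.yaml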

    Result

    DataStax Enterprise is ready for additional configuration:
    • For production, be sure to replace the default cassandra account with your own superuser account. Using the default account is a security risk. See .
    • DataStax Enterprise provides several types of workloads (default is transactional). See startup options for service or stand-alone installations.
    • What's next below provides links to related tasks and information.
  5. Optional: If using DSE analytics, you can use either the default Spark data and logging directory locations or define your own locations:
    • Default directory locations: If you want to use the default Spark directory locations, create and change ownership for the following:
      • /var/lib/dsefs
      • /var/lib/spark
      • /var/log/spark
      sudo mkdir -p /var/lib/dsefs; sudo chown -R $USER:$GROUP /var/lib/dsefs && 
        sudo mkdir -p /var/lib/spark; sudo chown -R $USER:$GROUP /var/lib/spark && 
        sudo mkdir -p /var/log/spark; sudo chown -R $USER:$GROUP /var/log/spark &&
        sudo mkdir -p /var/lib/spark/rdd; sudo chown -R $USER:$GROUP /var/lib/spark/rdd  &&
        sudo mkdir -p /var/log/spark/master; sudo chown -R $USER:$GROUP /var/log/spark/master  &&
        sudo mkdir -p /var/log/spark/alwayson_sql; sudo chown -R $USER:$GROUP /var/log/spark/alwayson_sql  &&
        sudo mkdir -p /var/lib/spark/worker; sudo chown -R $USER:$GROUP /var/lib/spark/worker
    • Define your own directory locations: If you want to define your own Spark directory locations:
      1. In the installation_location, create the Spark data and logging directories. For example:
        mkdir dsefs &&
          mkdir spark &&  
          cd spark && 
          mkdir log &&
          mkdir rdd && 
          mkdir worker &&
          cd log &&
          mkdir worker &&
          mkdir master &&
          mkdir alwayson_sql
      2. Go to the directory containing the spark-env.sh file:
        cd installation_location/resources/spark/conf
      3. Uncomment and update the following lines in the spark-env.sh file:
        export SPARK_WORKER_DIR="full_path_to_installation_location/spark/worker"
          export SPARK_EXECUTOR_DIRS="full_path_to_installation_location/spark/rdd"
          export SPARK_WORKER_LOG_DIR="full_path_to_installation_location/spark/log/worker"
          export SPARK_MASTER_LOG_DIR="full_path_to_installation_location/spark/log/master"
          export ALWAYSON_SQL_LOG_DIR="full_path_to_installation_location/spark/log/alwayson_sql"
      4. Go to the directory containing the dse.yaml file:
        cd installation_location/resources/dse/conf
      5. Uncomment and update the DSEFS work directory in the dsefs_options section of dse.yaml:
        work_dir: full_path_to_installation_location/dsefs
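
      As a quick optional check that the Spark and DSEFS settings point at the directories you created, you can grep both files (paths shown assume a tarball installation):
      grep -nE 'SPARK_WORKER_DIR|SPARK_EXECUTOR_DIRS|SPARK_WORKER_LOG_DIR|SPARK_MASTER_LOG_DIR|ALWAYSON_SQL_LOG_DIR' installation_location/resources/spark/conf/spark-env.sh
      grep -n 'work_dir' installation_location/resources/dse/conf/dse.yaml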
    DataStax Enterprise is ready for additional configuration. See What's next.
  6. Optional: Single-node cluster installations only:
    1. Start DataStax Enterprise from the installation directory:
      bin/dse cassandra

      where the installation directory is the directory into which you extracted the DSE tarball.

      Note: For other start options, see Starting DataStax Enterprise as a stand-alone process.
    2. Verify that DataStax Enterprise is running from the installation directory:
      bin/nodetool status
      Results using vnodes:
      Datacenter: Cassandra
      =====================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address    Load       Tokens  Owns    Host ID                               Rack
      UN  127.0.0.1  82.43 KB   128     ?       40725dc8-7843-43ae-9c98-7c532b1f517e  rack1
      Results not using vnodes:
      Datacenter: Analytics
      =====================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address         Load       Owns    Host ID                               Token                 Rack
      UN  172.16.222.136  103.24 KB  ?       3c1d0657-0990-4f78-a3c0-3e0c37fc3a06  1647352612226902707   rack1

What's next