Installing DataStax Enterprise 6.0 using the binary tarball

Instructions for installing DataStax Enterprise (DSE) 6.0 on any supported Linux-based platform.

Use these instructions for installing with root permissions on a Linux-based platform using a binary tarball.

Some things to know about installing DSE

  • The latest version of DataStax Enterprise 6.0 is 6.0.4.
  • When installed from the binary tarball, DataStax Enterprise runs as a stand-alone process.
  • When DSE is installed, it creates a cassandra user in the database and runs as this user. Do not use the cassandra user in production. Using the cassandra user is a security risk. See Adding a superuser login.

cassandra.yaml

The location of the cassandra.yaml file depends on the type of installation:
Package installations: /etc/dse/cassandra/cassandra.yaml
Tarball installations: installation_location/resources/cassandra/conf/cassandra.yaml

spark-env.sh

The default location of the spark-env.sh file depends on the type of installation:
Package installations: /etc/dse/spark/spark-env.sh
Tarball installations: installation_location/resources/spark/conf/spark-env.sh
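
The default locations listed above can be captured in a small shell helper, which is convenient in scripts that must work with both installation types. This is only a sketch: the function name dse_config_path and the example location /opt/dse-6.0.4 are hypothetical and not part of DSE.

```shell
# Hypothetical helper (not part of DSE): print the default path of a DSE
# configuration file for a given installation type.
#   $1 = cassandra.yaml | spark-env.sh
#   $2 = package | tarball
#   $3 = installation_location (used for tarball installations only)
dse_config_path() {
  case "$1" in
    cassandra.yaml) component=cassandra ;;
    spark-env.sh)   component=spark ;;
    *) echo "unknown file: $1" >&2; return 1 ;;
  esac
  case "$2" in
    package) printf '/etc/dse/%s/%s\n' "$component" "$1" ;;
    tarball) printf '%s/resources/%s/conf/%s\n' "$3" "$component" "$1" ;;
    *) echo "unknown installation type: $2" >&2; return 1 ;;
  esac
}

# Example: a tarball extracted under the hypothetical location /opt/dse-6.0.4
dse_config_path cassandra.yaml tarball /opt/dse-6.0.4
dse_config_path spark-env.sh package
```

The two example calls print /opt/dse-6.0.4/resources/cassandra/conf/cassandra.yaml and /etc/dse/spark/spark-env.sh.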

Prerequisites

Table 1. Hardware requirements
Requirement              Minimum   Production
CPUs                     2         16
Memory                   8 GB      24 GB
Data directory           20 GB     200 GB
Commit log directory     20 GB     200 GB
Saved caches directory   20 GB     200 GB
Logs directory           20 GB     200 GB
Also see Recommended production settings and the DataStax Enterprise Reference Architecture white paper.
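
As a rough aid, the CPU and memory minimums from Table 1 can be checked from the shell. The sketch below makes several assumptions: the helper name meets_dse_minimum is hypothetical, 8 GB is treated as 8 × 1024 × 1024 kB, and the values are read with nproc and /proc/meminfo. MemTotal is usually somewhat lower than the installed RAM, so treat a borderline result as a hint rather than a verdict. Disk space for the individual directories is not checked.

```shell
# Hypothetical helper: succeed when the given CPU count and memory (in kB)
# reach the "Minimum" column of Table 1 (2 CPUs, 8 GB of memory).
meets_dse_minimum() {
  [ "$1" -ge 2 ] && [ "$2" -ge $((8 * 1024 * 1024)) ]
}

# Read the values from the local machine when the usual Linux sources exist.
if command -v nproc >/dev/null 2>&1 && [ -r /proc/meminfo ]; then
  cpus=$(nproc)
  mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
  if meets_dse_minimum "$cpus" "$mem_kb"; then
    echo "CPU and memory minimums met: $cpus CPUs, $mem_kb kB"
  else
    echo "Below the CPU or memory minimum: $cpus CPUs, $mem_kb kB"
  fi
fi
```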

Procedure

In a terminal window:

  1. Verify that a required version of Java is installed:
    java -version
    With OpenJDK, the results should look like:
    openjdk version "1.8.0_171"
    OpenJDK Runtime Environment (build 1.8.0_171-8u171-b11-0ubuntu0.16.04.1-b11)
    OpenJDK 64-Bit Server VM (build 25.171-b11, mixed mode)
    With Oracle Java, the results should look like:
    java version "1.8.0_181"
    Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)

    If the installed version is not OpenJDK 8 or Oracle Java 8, see Installing supporting software on DataStax Enterprise.

  2. Install the libaio package. For example:
    • RHEL platforms:
      sudo yum install libaio
    • Debian platforms:
      sudo apt-get install libaio1
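
The Java check in step 1 can also be scripted. The following sketch parses the first line of the java -version output and reports whether it is a Java 8 runtime. The helper name java_feature_version is hypothetical, and the parsing assumes the banner format shown in the sample outputs above.

```shell
# Hypothetical helper: extract the Java feature version from a banner line
# such as: openjdk version "1.8.0_171"
java_feature_version() {
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;  # legacy scheme: 1.8.0_171 -> 8
    *)   printf '%s\n' "$ver" | cut -d. -f1 ;;  # newer scheme: 11.0.2 -> 11
  esac
}

# java -version writes to standard error, so redirect it before parsing.
if command -v java >/dev/null 2>&1; then
  banner=$(java -version 2>&1 | head -n 1)
  if [ "$(java_feature_version "$banner")" = "8" ]; then
    echo "Java 8 detected: $banner"
  else
    echo "Java 8 not detected: $banner"
  fi
else
  echo "java not found on PATH"
fi
```

The script only identifies the version number. It does not distinguish OpenJDK from Oracle Java.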

Installing the latest version (6.0.4)

To download a specific version of DSE 6.0.x, go to Installing specific 6.0.x versions.

  1. When installing from the binary tarball, you can either download the tarball and then extract the files, or use curl.
    • Download and extract the latest version tarball (6.0.4):
      1. Using your DataStax Academy registration email address and Downloads Key or Profile Name and password, download the tarball from dse-6.0.4-bin.tar.gz.
      2. Extract the files:
        tar -xzvf dse-6.0.4-bin.tar.gz
    • Use curl to install the latest version (6.0.4):
      CAUTION: If you choose this method, your password is retained in the shell history. To avoid this security issue, DataStax recommends using curl with the --netrc or --netrc-file option.
      1. Download and extract the tarball using curl:
        curl --user DSA_email_address:downloads_key -L \
        https://downloads.datastax.com/enterprise/dse-6.0.4-bin.tar.gz | tar xz

        where DSA_email_address and downloads_key are your DataStax Academy email address and My Downloads Key. Depending on your environment, you might need to replace @ in your email address with %40 and escape any character in your password that has a special meaning in your shell. Examples: \! and \|.

        For backward compatibility, you can use your DataStax Academy Profile Name and password instead of your email address and Downloads Key.

      The files are downloaded and extracted into the 6.0 directory.
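
The CAUTION above recommends the --netrc or --netrc-file option. One possible way to follow that advice is sketched below. The function names and the netrc file location are hypothetical. The machine, login, and password tokens are the standard netrc format that curl reads. The Downloads Key is read from standard input instead of being passed as an argument, so it does not appear in the shell history.

```shell
# Hypothetical helper: write credentials for downloads.datastax.com to a
# netrc file that only the current user can read.
#   $1 = path of the netrc file, $2 = DataStax Academy email address
#   The Downloads Key is read from standard input.
write_dse_netrc() {
  IFS= read -r key
  ( umask 077
    printf 'machine downloads.datastax.com login %s password %s\n' "$2" "$key" > "$1" )
}

# Hypothetical helper: download and extract a DSE tarball, taking the
# credentials from the netrc file instead of the command line.
#   $1 = path of the netrc file, $2 = version, for example 6.0.4
download_dse() {
  curl --netrc-file "$1" -L \
    "https://downloads.datastax.com/enterprise/dse-$2-bin.tar.gz" | tar xz
}
```

For example, run write_dse_netrc "$HOME/.dse-netrc" DSA_email_address, type the key followed by Enter, and then run download_dse "$HOME/.dse-netrc" 6.0.4. Creating the file in a text editor works equally well. Delete the file once the download has finished if you do not need it again.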

Installing specific 6.0.x versions

  1. When installing from the binary tarball, you can either download the tarball and then extract the files, or use curl.
    • Download and extract specific 6.0.x tarballs into the current directory:
      1. Using your DataStax Academy registration email address and Downloads Key or Profile Name and password, download the tarball from Download DataStax Enterprise.
      2. Extract the files:
        tar -xzvf dse-version_number-bin.tar.gz
    • Use curl to install specific 6.0.x versions:
      CAUTION: If you choose this method, your password is retained in the shell history. To avoid this security issue, DataStax recommends using curl with the --netrc or --netrc-file option.
      Download and extract:
      curl --user DSA_email_address:downloads_key -L \
      https://downloads.datastax.com/enterprise/dse-version_number-bin.tar.gz | tar xz

      where DSA_email_address and downloads_key are your DataStax Academy email address and My Downloads Key. Depending on your environment, you might need to replace @ in your email address with %40 and escape any character in your password that has a special meaning in your shell. Examples: \! and \|.

      For backward compatibility, you can use your DataStax Academy Profile Name and password instead of your email address and Downloads Key.

      The files are downloaded and extracted into the 6.0 directory.

  2. You can use either the default data and logging directory locations or define your own locations:
    • Default directory locations: If you want to use the default data and logging directory locations, create the following directories and change their ownership:
      • /var/lib/cassandra
      • /var/log/cassandra (includes audit directory, debug.log, gremlin.log, solrvalidation.log, system.log)
      • /var/lib/dsefs
      • /var/lib/spark
      • /var/log/spark
      sudo mkdir -p /var/lib/cassandra && sudo chown -R "$USER":"$(id -gn)" /var/lib/cassandra &&
        sudo mkdir -p /var/log/cassandra && sudo chown -R "$USER":"$(id -gn)" /var/log/cassandra &&
        sudo mkdir -p /var/lib/dsefs && sudo chown -R "$USER":"$(id -gn)" /var/lib/dsefs &&
        sudo mkdir -p /var/lib/spark && sudo chown -R "$USER":"$(id -gn)" /var/lib/spark &&
        sudo mkdir -p /var/log/spark && sudo chown -R "$USER":"$(id -gn)" /var/log/spark &&
        sudo mkdir -p /var/lib/spark/rdd && sudo chown -R "$USER":"$(id -gn)" /var/lib/spark/rdd &&
        sudo mkdir -p /var/lib/spark/worker && sudo chown -R "$USER":"$(id -gn)" /var/lib/spark/worker
    • Define your own directory locations: If you want to define your own data and logging directory locations:
      1. In the installation_location, create the data and logging directories. For example:
        mkdir dse-data &&
          cd dse-data && 
          mkdir commitlog && 
          mkdir saved_caches &&
          mkdir hints && 
          mkdir cdc_raw
      2. Go to the directory containing the cassandra.yaml file:
        cd installation_location/resources/cassandra/conf
      3. Edit the following lines in the cassandra.yaml file:
        data_file_directories: full_path_to_installation_location/dse-data
        commitlog_directory: full_path_to_installation_location/dse-data/commitlog
        saved_caches_directory: full_path_to_installation_location/dse-data/saved_caches
        hints_directory: full_path_to_installation_location/dse-data/hints
        cdc_raw_directory: full_path_to_installation_location/dse-data/cdc_raw
  3. Optional: To define your own Spark directories:
    1. Create the Spark lib and log directories.
    2. Edit the spark-env.sh file to match the locations of your Spark lib and log directories, as described in Configuring Spark nodes.
    3. Create the DSEFS data directory and set its location in dsefs_options.
    DataStax Enterprise is ready for additional configuration. See What's next.
  4. Optional: Single-node cluster installations only:
    1. Start DataStax Enterprise from the installation directory:
      bin/dse cassandra
      where the installation directory is either:
      • /usr/share/dse
      • DataStax Enterprise installation directory
      Note: For other start options, see Starting DataStax Enterprise as a stand-alone process.
    2. From the installation directory, verify that DataStax Enterprise is running:
      bin/nodetool status
      Results using vnodes:
      Datacenter: Cassandra
      =====================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address    Load       Tokens  Owns    Host ID                               Rack
      UN  127.0.0.1  82.43 KB   128     ?       40725dc8-7843-43ae-9c98-7c532b1f517e  rack1
      Results not using vnodes:
      Datacenter: Analytics
      =====================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address         Load       Owns    Host ID                               Token                 Rack
      UN  172.16.222.136  103.24 KB  ?       3c1d0657-0990-4f78-a3c0-3e0c37fc3a06  1647352612226902707   rack1
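
In the sample results above, a line that starts with UN describes a node whose status is Up and whose state is Normal. For scripted checks, that line prefix can be counted. The sketch below is one way to do it. The helper name count_un_nodes is hypothetical, and the parsing assumes the output layout shown above.

```shell
# Hypothetical helper: read `nodetool status` output on standard input and
# print the number of nodes reported as UN (Up, Normal).
count_un_nodes() {
  awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Run the check only where nodetool is available, that is, from the
# installation directory of a started node.
if [ -x bin/nodetool ]; then
  up=$(bin/nodetool status | count_un_nodes)
  echo "Nodes in state UN: $up"
fi
```

On a single-node cluster that started correctly, the count should be 1.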

What's next