Installing the DataStax Distribution of Apache Cassandra using the binary tarball

Instructions for installing DataStax Distribution of Apache Cassandra on any supported Linux-based platform.

Use these instructions for installing the DataStax Distribution of Apache Cassandra™ (DDAC) on any Linux-based platform using a binary tarball.

Some things to know about DataStax Distribution of Apache Cassandra

  • DataStax Distribution of Apache Cassandra is based on DataStax Enterprise (DSE) 5.1 and Apache Cassandra 3.11.
  • DataStax Distribution of Apache Cassandra runs as a stand-alone process.
Warning: When DataStax Distribution of Apache Cassandra is installed, it creates a cassandra user in the database. Do not use the cassandra user in production; doing so is a security risk. See Creating superuser accounts.

Prerequisites

  • A supported platform. DataStax Distribution of Apache Cassandra supports the same platforms as listed in the DSE 5.1 (x86_64) column.
  • Your DataStax Academy registration email address and Downloads Key or Profile Name and password.
  • Configure your operating system to use the latest version of Java 8 (OpenJDK 8 or Oracle Java 8).
  • RedHat-compatible distributions require EPEL (Extra Packages for Enterprise Linux).
  • Python 2.7.x

    For older RHEL distributions, see Installing Python 2.7 on older RHEL-based package installations.
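
The Java and Python prerequisites can also be checked from a script. The following is a minimal sketch, not part of the DataStax tooling: the function names is_java8 and is_python27 are hypothetical, and the patterns assume version banners shaped like openjdk version "1.8.0_171" and Python 2.7.x.

```shell
# Hypothetical helper: succeeds only if the given `java -version` output
# contains a Java 8 banner such as: openjdk version "1.8.0_171"
is_java8() {
  printf '%s\n' "$1" | grep -Eq 'version "1\.8\.'
}

# Hypothetical helper: succeeds only if the given `python --version` output
# reports a 2.7 release such as: Python 2.7.15
is_python27() {
  printf '%s\n' "$1" | grep -Eq '^Python 2\.7(\.|$)'
}

# Usage (both commands write the version to stderr, so redirect it first):
#   is_java8 "$(java -version 2>&1)" || echo "Java 8 not found"
#   is_python27 "$(python --version 2>&1)" || echo "Python 2.7 not found"
```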

Table 1. Hardware requirements
Requirement              Minimum   Production
CPUs                     2         16
Memory                   8 GB      24 GB
Data directory           20 GB     200 GB
Commit log directory     20 GB     200 GB
Saved caches directory   20 GB     200 GB
Logs directory           20 GB     200 GB
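
As a rough illustration, the minimum CPU and memory values in Table 1 can be compared against a host from a script. This is a sketch under stated assumptions: the function name meets_minimum_hardware is hypothetical, and the usage lines assume a Linux host that provides nproc and /proc/meminfo. Disk space for the individual directories is not checked here.

```shell
# Hypothetical helper: compares a CPU count and a memory size (in GB) with
# the minimum values from Table 1 (2 CPUs, 8 GB of memory).
meets_minimum_hardware() {
  cpus=$1
  mem_gb=$2
  [ "$cpus" -ge 2 ] && [ "$mem_gb" -ge 8 ]
}

# Usage on Linux (assumption: nproc and /proc/meminfo are available).
# MemTotal is reported in kB and is slightly lower than the installed
# memory, so the value is rounded up to the next whole GB:
#   cpus=$(nproc)
#   mem_gb=$(awk '/^MemTotal:/ {printf "%d", ($2 + 1048575) / 1048576}' /proc/meminfo)
#   meets_minimum_hardware "$cpus" "$mem_gb" || echo "Below minimum requirements"
```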

Procedure

In a terminal window:

  1. Verify that a required version of Java is installed:
    java -version
    If OpenJDK, the results should look like:
    openjdk version "1.8.0_171"
    OpenJDK Runtime Environment (build 1.8.0_171-8u171-b11-0ubuntu0.16.04.1-b11)
    OpenJDK 64-Bit Server VM (build 25.171-b11, mixed mode)
    If Oracle Java, the results should look like:
    java version "1.8.0_181"
    Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)

    If the output does not show OpenJDK 8 or Oracle Java 8, see Installing supporting software.

  2. Download the tarball using your DataStax Academy registration email address and Downloads Key or Profile Name and password.
  3. Extract the files:
    tar -xzvf ddac-5.1.11-bin.tar.gz
  4. Define the data and logging directory locations using one of the following options:
    • Default directory locations - No action is required. When the directory locations are excluded or commented out in the cassandra.yaml file, Cassandra uses the default locations:
      • data_file_directories: installation_location/data/data
      • commitlog_directory: installation_location/data/commitlog
      • saved_caches_directory: installation_location/data/saved_caches
      • hints_directory: installation_location/data/hints
      • cdc_raw_directory: installation_location/data/cdc_raw
    • Recommended directory locations - Most production deployments store data and logs in /var/lib/cassandra. To use the recommended location:
      1. Create the /var/lib/cassandra directory and change its owner:
        sudo mkdir -p /var/lib/cassandra && sudo chown -R \
          "$USER":"$(id -gn)" /var/lib/cassandra
        Tip: Ensure that the account that runs Cassandra has write access to the data and log directories. The subdirectories are created automatically.
      2. Go to the directory containing the cassandra.yaml file:
        cd installation_location/conf
      3. Uncomment the following lines in the cassandra.yaml file:
        data_file_directories:
            - /var/lib/cassandra/data
        commitlog_directory: /var/lib/cassandra/commitlog
        saved_caches_directory: /var/lib/cassandra/saved_caches
        hints_directory: /var/lib/cassandra/data/hints
        cdc_raw_directory: /var/lib/cassandra/cdc_raw
        Tip: When using a minimal cassandra.yaml file, add the options as shown above.
    • Custom location - To define custom data and logging directory locations:
      1. Create the data and logging directories. For example:
        mkdir -p ~/scratch/test/data &&
        mkdir -p ~/scratch/test/commitlog &&
        mkdir -p ~/scratch/test/saved_caches &&
        mkdir -p ~/scratch/test/data/hints &&
        mkdir -p ~/scratch/test/cdc_raw
        Tip: Ensure that the account that runs Cassandra has write access to the data and log directories.
      2. Go to the directory containing the cassandra.yaml file:
        cd installation_location/conf
      3. Edit the following lines in the cassandra.yaml file:
        data_file_directories:
            - /Users/janedoe/scratch/test/data
        commitlog_directory: /Users/janedoe/scratch/test/commitlog
        saved_caches_directory: /Users/janedoe/scratch/test/saved_caches
        hints_directory: /Users/janedoe/scratch/test/data/hints
        cdc_raw_directory: /Users/janedoe/scratch/test/cdc_raw
  5. Optional: Single-node cluster installations only.
    1. Start DataStax Distribution of Apache Cassandra from the installation directory:
      bin/cassandra
    2. From the installation directory, verify that DataStax Distribution of Apache Cassandra is running:
      bin/nodetool status
      Results using vnodes:
      Datacenter: Cassandra
      =====================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address    Load       Tokens  Owns    Host ID                               Rack
      UN  127.0.0.1  82.43 KB   128     ?       40725dc8-7843-43ae-9c98-7c532b1f517e  rack1
      Results not using vnodes:
      Datacenter: Cassandra
      =====================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address         Load       Owns    Host ID                               Token                 Rack
      UN  172.16.222.136  103.24 KB  ?       3c1d0657-0990-4f78-a3c0-3e0c37fc3a06  1647352612226902707   rack1
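
The same check can be scripted by confirming that every node row in the nodetool status output begins with UN (Up/Normal). The following is a minimal sketch that assumes output in the format shown above; the function name all_nodes_up_normal is hypothetical.

```shell
# Hypothetical helper: reads `nodetool status` output on stdin and succeeds
# only if at least one node row is present and every node row starts with UN.
# Node rows are recognized by a two-letter status/state code in the first
# column (U or D, followed by N, L, J, or M); header lines do not match.
all_nodes_up_normal() {
  awk '
    $1 ~ /^[UD][NLJM]$/ { nodes++; if ($1 != "UN") bad++ }
    END { exit !(nodes > 0 && bad == 0) }
  '
}

# Usage, from the installation directory:
#   bin/nodetool status | all_nodes_up_normal && echo "All nodes are up and normal"
```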