Installing DataStax Enterprise 4.8 using the binary tarball

Install DataStax Enterprise on any Linux-based platform.

For other product installations, see Installing OpsCenter and Installing DevCenter.

When installed from the binary tarball, DataStax Enterprise runs as a stand-alone process.

Important: DataStax Enterprise 4.8 uses Cassandra 2.1 and CQL 3.1.

Prerequisites

  • All Linux platforms:
    • Be sure your platform is supported.
    • DataStax Academy registration email address and password.
    • Oracle Java SE Runtime Environment 7 or 8, or OpenJDK 7. The latest version of each is recommended.
      Note: If using Oracle Java 7, you must use at least 1.7.0_25. If using Oracle Java 8, you must use at least 1.8.0_40. In some cases, using JDK 1.8 causes minor performance degradation compared to JDK 1.7.
    • Python 2.6 (minimum); 2.7 (recommended).
  • On some versions of Mac OS X, you may need to install readline: easy_install readline.
  • RedHat-compatible distributions:
    • If installing on a 64-bit Oracle Linux distribution, first install the 32-bit versions of glibc libraries.
    • If you are using an earlier RHEL-based Linux distribution, such as CentOS-5, you might need to replace the Snappy compression/decompression library; see the DataStax Enterprise 4.5.0 Release Notes.
    • Before installing, make sure EPEL (Extra Packages for Enterprise Linux) is installed. See Installing EPEL on RHEL OS 5.x.
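The Java minimums in the note above can be checked with a short script. The following is a minimal sketch, not part of the product: the function name java_version_ok is hypothetical, and it assumes that java -version prints a quoted version string such as "1.8.0_45" on its first line, which is the format used by Java 7 and 8 runtimes.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if a Java version string meets the minimums
# stated above (1.7.0_25 for Java 7, 1.8.0_40 for Java 8), 1 otherwise.
java_version_ok() {
    version=$1
    case "$version" in
        1.7.0_*) min=25 ;;
        1.8.0_*) min=40 ;;
        *) return 1 ;;              # not a Java 7 or 8 update release
    esac
    update=${version#*_}            # text after the underscore, e.g. 45-b14
    update=${update%%[!0-9]*}       # keep only the leading digits
    [ -n "$update" ] && [ "$update" -ge "$min" ]
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr; extract the quoted version on line 1.
    installed=$(java -version 2>&1 | sed -n '1s/.*"\(.*\)".*/\1/p')
    if java_version_ok "$installed"; then
        echo "Java $installed meets the minimum version"
    else
        echo "Java $installed is too old or is not a Java 7 or 8 release"
    fi
else
    echo "java was not found on the PATH"
fi
```

The note states the minimums for Oracle Java only; applying the same update numbers to OpenJDK 7 is an assumption of this sketch.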

Hardware requirements

Requirement              Minimum   Production
CPUs                     2         16
Memory                   8 GB      24 GB
Data directory           20 GB     200 GB
Commit log directory     20 GB     200 GB
Saved caches directory   20 GB     200 GB
Logs directory           20 GB     200 GB

Production requirements depend on the volume of data and workload.
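
To compare a Linux host against the minimum column, a script along the following lines may help. It is only a sketch: the check function and the MIN_* variable names are hypothetical, memory is read from /proc/meminfo, and free space is measured once under /var (the parent of the default directories) against the 20 GB minimum for a single directory.

```shell
#!/bin/sh
# Minimum values taken from the hardware requirements table.
MIN_CPUS=2
MIN_MEM_GB=8
MIN_DISK_GB=20

# Hypothetical helper: compares a measured whole number against a minimum.
check() {
    label=$1
    actual=$2
    minimum=$3
    case $actual in
        ''|*[!0-9]*) echo "$label: could not be determined"; return 0 ;;
    esac
    if [ "$actual" -ge "$minimum" ]; then
        echo "$label: $actual (ok, minimum $minimum)"
    else
        echo "$label: $actual (below minimum $minimum)"
    fi
}

cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null)

mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null)
if [ -n "$mem_kb" ]; then
    # Round to the nearest GB; the kernel reports slightly less than installed RAM.
    mem_gb=$(( (mem_kb + 524288) / 1048576 ))
else
    mem_gb=
fi

# Column 4 of POSIX df output is the available space in 1024-byte blocks.
disk_gb=$(df -Pk /var 2>/dev/null | awk 'NR==2 {print int($4 / 1048576)}')

check "CPUs" "$cpus" "$MIN_CPUS"
check "Memory (GB)" "$mem_gb" "$MIN_MEM_GB"
check "Free space under /var (GB)" "$disk_gb" "$MIN_DISK_GB"
```

If the data, commit log, saved caches, and logs directories are placed on separate file systems, run the df check against each mount point instead.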

Also see Recommended production settings and the DataStax Enterprise Reference Architecture white paper.

Procedure

These steps install DataStax Enterprise, the DataStax Agent, and OpsCenter (optional). After installing, you must configure and start DataStax Enterprise.

In a terminal window:

  1. Verify that a required version of Java is installed:
    java -version

    If the output does not show Oracle Java 7, Oracle Java 8, or OpenJDK 7, see Installing Oracle JDK or the OpenJDK documentation.

    Important: Package management tools do not install Oracle Java.
  2. Download the tarball from the download-previous-versions page using the DataStax Academy account credentials you created on the registration page. Be sure to use your registration email address, not your username.
  3. Unpack the distribution:
    tar -xzvf dse-4.8.X.tar.gz
    Note: Be sure to change X to an actual version number. To view the available versions, see the Release notes. The latest version of DataStax Enterprise 4.8 is 4.8.16.

    The files are extracted into the dse-4.8.X directory.

  4. Optional: Download and extract the OpsCenter tarball:
    curl -L https://downloads.datastax.com/community/opscenter.tar.gz | tar xz

    For production installations, DataStax recommends installing OpsCenter separately from the cluster. See the OpsCenter documentation.

  5. To use the default data and logging directory locations, create the following directories and change their ownership:
    • /var/lib/cassandra
    • /var/log/cassandra
    • /var/lib/spark
    • /var/log/spark
    sudo mkdir -p /var/lib/cassandra; sudo chown -R $USER:$GROUP /var/lib/cassandra
    sudo mkdir -p /var/log/cassandra; sudo chown -R $USER:$GROUP /var/log/cassandra
    sudo mkdir -p /var/lib/spark; sudo chown -R $USER:$GROUP /var/lib/spark
    sudo mkdir -p /var/log/spark; sudo chown -R $USER:$GROUP /var/log/spark
  6. Optional: If you do not want to use the default data and logging directories, you can define your own directory locations:
    1. Create the data, commit log, and saved caches directories:
      mkdir install_location/dse-data
      cd install_location/dse-data
      mkdir commitlog
      mkdir saved_caches
    2. Go to the directory containing the cassandra.yaml file:
      cd install_location/resources/cassandra/conf
    3. Edit the following lines in the cassandra.yaml file:
      data_file_directories:
          - install_location/dse-data
      commitlog_directory: install_location/dse-data/commitlog
      saved_caches_directory: install_location/dse-data/saved_caches
      The location of the cassandra.yaml file depends on the type of installation:
      Package installations   /etc/dse/cassandra/cassandra.yaml
      Tarball installations   install_location/resources/cassandra/conf/cassandra.yaml
  7. Optional: If you do not want to use the default Spark directories, you can define your own directory locations:
    1. Create the Spark lib and log directories.
    2. Edit the spark-env.sh file to match the locations of your Spark lib and log directories, as described in Configuring Spark nodes.
      The default location of the spark-env.sh file depends on the type of installation:
      Installer-Services and Package installations      /etc/dse/spark/spark-env.sh
      Installer-No Services and Tarball installations   install_location/resources/spark/conf/spark-env.sh
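
Before configuring DataStax Enterprise, it can be useful to confirm that the directories created in the steps above exist and are writable by the current user. The following is a minimal sketch under that assumption; the check_dir function name is hypothetical, and the list uses the default locations from step 5. If you defined your own locations in steps 6 and 7, substitute those paths.

```shell
#!/bin/sh
# Hypothetical helper: reports whether a directory exists and is writable
# by the user running this script. Returns 1 if it is not.
check_dir() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or not writable: $1"
        return 1
    fi
}

status=0
# Default data and logging directories from step 5.
for dir in /var/lib/cassandra /var/log/cassandra /var/lib/spark /var/log/spark; do
    check_dir "$dir" || status=1
done

if [ "$status" -eq 0 ]; then
    echo "All directories are ready"
else
    echo "Create the missing directories or fix their ownership, then check again"
fi
```

Run the script as the same user that will start DataStax Enterprise, because the writability test applies to the current user only.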

Results

DataStax Enterprise is ready for configuration.

What's next