Installing DataStax Enterprise 4.7 using the binary tarball

Install DataStax Enterprise on any Linux-based platform.

For a complete list of supported platforms, see DataStax Enterprise Supported Platforms.

Important: DataStax Enterprise 4.7 uses Cassandra 2.1.

Prerequisites

  • All Linux platforms:
    • DataStax Academy registration email address and password.
    • Latest version of Oracle Java SE Runtime Environment 7 or 8 or OpenJDK 7 is recommended.
      Note: If using Oracle Java 7, you must use at least 1.7.0_25. If using Oracle Java 8, you must use at least 1.8.0_40. In some cases, using JDK 1.8 causes minor performance degradation compared to JDK 1.7.
    • Python 2.6 (minimum); 2.7 (recommended).
  • On some versions of Mac OS X, you might need to install readline: easy_install readline.
  • RedHat-compatible distributions:
    • If installing on a 64-bit Oracle Linux distribution, first install the 32-bit versions of glibc libraries.
    • If you are using an earlier RHEL-based Linux distribution, such as CentOS-5, you might need to replace the Snappy compression/decompression library; see the DataStax Enterprise 4.5.0 Release Notes.
    • Before installing, make sure EPEL (Extra Packages for Enterprise Linux) is installed. See Installing EPEL on RHEL OS 5.x.
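
The version prerequisites above can be checked with a short script. The following is a minimal sketch, not a DataStax tool: the function names (version_ge, java_ok, python_ok) are hypothetical, the Java check applies the Oracle minimums from the note to any 1.7 or 1.8 runtime, and limiting Python to the 2.x line is an assumption based on the versions listed.

```shell
#!/bin/sh
# Sketch: compare installed Java and Python versions with the minimums above.
# All function names here are illustrative, not part of DataStax Enterprise.

# version_ge A B: succeed if version A >= version B.
# Versions are split on "." and "_" and compared field by field as numbers.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(a, x, /[._]/); nb = split(b, y, /[._]/)
    n = (na > nb) ? na : nb
    for (i = 1; i <= n; i++) {
      p = x[i] + 0; q = y[i] + 0
      if (p > q) exit 0
      if (p < q) exit 1
    }
    exit 0
  }'
}

# java_ok VERSION: Java 7 needs at least 1.7.0_25, Java 8 at least 1.8.0_40.
java_ok() {
  case $1 in
    1.7.*) version_ge "$1" 1.7.0_25 ;;
    1.8.*) version_ge "$1" 1.8.0_40 ;;
    *) return 1 ;;
  esac
}

# python_ok VERSION: Python 2.6 or later (assumption: 2.x only).
python_ok() {
  case $1 in
    2.*) version_ge "$1" 2.6 ;;
    *) return 1 ;;
  esac
}

# Report on whatever is installed; a missing tool is reported, not fatal.
if command -v java >/dev/null 2>&1; then
  jv=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  if java_ok "$jv"; then echo "Java $jv: OK"; else echo "Java $jv: below minimum or unsupported"; fi
else
  echo "Java: not found"
fi

if command -v python >/dev/null 2>&1; then
  pv=$(python --version 2>&1 | awk '{print $2}')
  if python_ok "$pv"; then echo "Python $pv: OK"; else echo "Python $pv: below minimum or unsupported"; fi
else
  echo "Python: not found"
fi
```

The script only reports what it finds; it does not distinguish Oracle Java from OpenJDK, so confirm the vendor separately with java -version.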

Hardware requirements

Requirement             Minimum  Production
CPUs                    2        16
Memory                  8 GB     24 GB
Data directory          20 GB    200 GB
Commit log directory    20 GB    200 GB
Saved caches directory  20 GB    200 GB
Logs directory          20 GB    200 GB

Also see Recommended production settings and the DataStax Enterprise Reference Architecture white paper.
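
As a rough illustration of the CPU and memory figures above, the sketch below compares the local machine with the minimum and production values. It assumes a Linux host with /proc/meminfo; the classify function is a hypothetical helper, and memory is rounded to the nearest GB because the kernel reports slightly less than the installed amount.

```shell
#!/bin/sh
# Sketch: compare this host's CPU count and memory with the hardware table.
# classify is an illustrative helper, not a DataStax tool.

# classify VALUE MINIMUM PRODUCTION: print the tier that VALUE reaches.
classify() {
  if [ "$1" -ge "$3" ]; then
    echo "production"
  elif [ "$1" -ge "$2" ]; then
    echo "minimum"
  else
    echo "below minimum"
  fi
}

# Number of online processors; falls back to 0 if getconf cannot report it.
cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 0)

# Total memory in GB, rounded to the nearest whole number (Linux only).
if [ -r /proc/meminfo ]; then
  mem_gb=$(awk '/^MemTotal:/ {printf "%d", $2 / 1048576 + 0.5}' /proc/meminfo)
fi
mem_gb=${mem_gb:-0}

echo "CPUs:   $cpus ($(classify "$cpus" 2 16))"
echo "Memory: ${mem_gb} GB ($(classify "$mem_gb" 8 24))"
```

Disk space for the data, commit log, saved caches, and logs directories can be checked in the same way with df once the directory locations are decided.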

The binary tarball runs as a stand-alone process.

Procedure

These steps install DataStax Enterprise, the DataStax Agent, and OpsCenter (optional). After installing, you must configure and start DataStax Enterprise.

In a terminal window:

Note: In the following commands, be sure to change X to an actual version number. To view the available versions, see the Release notes. The latest version of DataStax Enterprise 4.7 is 4.7.9.

  1. Verify that a required version of Java is installed:
    $ java -version

    If the installed version is not Oracle Java 7, Oracle Java 8, or OpenJDK 7, see Installing Oracle JDK or the OpenJDK documentation.

    Important: Package management tools do not install Oracle Java.
  2. Download the tarball from the download-previous-versions page using the DataStax Academy account credentials you created on the registration page. Be sure to use your registration email address, not your username.

    For production installations, DataStax recommends installing OpsCenter separately from the cluster. See the OpsCenter documentation.

    Attention: Depending on your environment, you might need to replace @ in your email address with %40 and escape any character in your password that is used in your operating system's command line. Examples: \! and \|.
  3. Unpack the distribution:
    $ tar -xzvf dse-4.7.X.tar.gz

    The files are extracted into the dse-4.7.X directory.

  4. If you do not have root access to the default directory locations, you can define your own directory locations as described in the following steps, or change the ownership of the default directories:
    • /var/lib/cassandra
    • /var/log/cassandra
    • /var/lib/spark
    • /var/log/spark
    $ sudo mkdir -p /var/lib/cassandra; sudo chown -R $USER:$GROUP /var/lib/cassandra
    $ sudo mkdir -p /var/log/cassandra; sudo chown -R $USER:$GROUP /var/log/cassandra
    $ sudo mkdir -p /var/lib/spark; sudo chown -R $USER:$GROUP /var/lib/spark
    $ sudo mkdir -p /var/log/spark; sudo chown -R $USER:$GROUP /var/log/spark
  5. Optional: If you do not want to use the default data and logging directories, you can define your own directory locations:
    1. Create the data and logging directories:
      $ mkdir install_location/dse-data
      $ cd install_location/dse-data
      $ mkdir commitlog
      $ mkdir saved_caches
    2. Go to the directory containing the cassandra.yaml file:
      $ cd install_location/resources/cassandra/conf
    3. Edit the following lines in the cassandra.yaml file so that they point to the new directories. The location of the cassandra.yaml file depends on the type of installation:
      Package installations   /etc/dse/cassandra/cassandra.yaml
      Tarball installations   install_location/resources/cassandra/conf/cassandra.yaml

      data_file_directories:
          - install_location/dse-data
      commitlog_directory: install_location/dse-data/commitlog
      saved_caches_directory: install_location/dse-data/saved_caches
  6. Optional: If you do not want to use the default Spark directories, you can define your own directory locations:
    1. Create the Spark lib and log directories.
    2. Edit the spark-env.sh file to match the locations of your Spark lib and log directories, as described in Configuring Spark nodes.
      The default location of the spark-env.sh file depends on the type of installation:
      Installer-Services and Package installations      /etc/dse/spark/spark-env.sh
      Installer-No Services and Tarball installations   install_location/resources/spark/conf/spark-env.sh
  7. Optional: Review the installation logs to verify the installation.
    Location                                                        Description
    /usr/share/dse/backups/log_file_dir/copied_config_files.log    Show Config File Overwrites
    /usr/share/dse/backups/log_file_dir/bitrock_installer.log      View Installation Log
    /usr/share/dse/backups/log_file_dir/install_dependencies.log   View Dependency Installation Log
    /usr/share/dse/backups/pfc_results.txt                         View Configuration Recommendations and Warnings (Preflight Check Results)
    /usr/share/dse                                                 View README
    /usr/share/dse                                                 Uninstall DataStax Enterprise
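
The custom data directory setup in step 5 can be scripted. The sketch below is one possible approach under stated assumptions: make_dse_dirs is a hypothetical helper, its argument stands in for install_location, and data_file_directories is printed as a YAML list on the assumption that cassandra.yaml expects that form. It only prints the settings; editing cassandra.yaml remains a manual step.

```shell
#!/bin/sh
# Sketch: create the custom directories from step 5 and print the matching
# cassandra.yaml settings. make_dse_dirs is illustrative, not a DataStax tool.

# make_dse_dirs INSTALL_LOCATION
#   Creates INSTALL_LOCATION/dse-data with commitlog and saved_caches inside,
#   then prints the three properties to copy into cassandra.yaml.
make_dse_dirs() {
  data_dir="$1/dse-data"
  # -p creates missing parent directories and does not fail if they exist.
  mkdir -p "$data_dir/commitlog" "$data_dir/saved_caches" || return 1
  # Assumption: data_file_directories takes a YAML list of paths.
  printf 'data_file_directories:\n    - %s\n' "$data_dir"
  printf 'commitlog_directory: %s\n' "$data_dir/commitlog"
  printf 'saved_caches_directory: %s\n' "$data_dir/saved_caches"
}
```

For example, after unpacking the tarball in your home directory, running make_dse_dirs "$HOME/dse-4.7.X" (with X replaced by the actual version number) creates the directories and prints the lines to paste into the file.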

Results

DataStax Enterprise is ready for configuration.

What's next