Installing DataStax Enterprise 5.1 using the binary tarball
Instructions for installing DataStax Enterprise 5.1 on any supported Linux-based platform.
Use these instructions for installing DataStax Enterprise (DSE) 5.1 on Linux-based platforms using a binary tarball.
Some things to know about installing DSE
- The latest version of DataStax Enterprise 5.1 is 5.1.20.
- When installed from the binary tarball, DataStax Enterprise runs as a stand-alone process.
- This procedure installs DataStax Enterprise 5.1 and the developer-related tools: Javadoc and the DataStax Enterprise demos.
It does not install OpsCenter, the DataStax Agent, DataStax Studio, or the DSE Graph Loader.
cassandra.yaml
The location of the cassandra.yaml file depends on the type of installation:
- Package installations: /etc/dse/cassandra/cassandra.yaml
- Tarball installations: installation_location/resources/cassandra/conf/cassandra.yaml
spark-env.sh
The default location of the spark-env.sh file depends on the type of installation:
- Package installations: /etc/dse/spark/spark-env.sh
- Tarball installations: installation_location/resources/spark/conf/spark-env.sh
dse.yaml
The location of the dse.yaml file depends on the type of installation:
- Package installations: /etc/dse/dse.yaml
- Tarball installations: installation_location/resources/dse/conf/dse.yaml
Prerequisites
- A supported platform.
- Configure your operating system to use the latest version of Java 8:
- Recommended. The latest build of a TCK (Technology Compatibility Kit) Certified OpenJDK version 8. For example, OpenJDK 8 (1.8.0_151 minimum). DataStax's recommendation changed due to the end of public updates for Oracle JRE/JDK 8. See Oracle Java SE Support Roadmap.
- Supported. Oracle Java SE 8 (JRE or JDK) (1.8.0_151 minimum)
- RedHat-compatible distributions require EPEL (Extra Packages for Enterprise Linux).
- Python 2.7.x (For older RHEL distributions, see Installing Python 2.7 on older RHEL-based package installations.)
Hardware requirements
Procedure
In a terminal window:
- Verify that a required version of Java is installed:
  java -version
  Note: DataStax recommends the latest build of a Technology Compatibility Kit (TCK) Certified OpenJDK version 8.
  If OpenJDK, the output looks similar to:
  openjdk version "1.8.0_171"
  OpenJDK Runtime Environment (build 1.8.0_171-8u171-b11-0ubuntu0.16.04.1-b11)
  OpenJDK 64-Bit Server VM (build 25.171-b11, mixed mode)
  If Oracle Java, the output looks similar to:
  java version "1.8.0_181"
  Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
  Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
  If neither OpenJDK 8 nor Oracle Java 8 is installed, see Installing the JDK.
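If you script this check, the major version can be parsed out of the banner line. A minimal sketch, assuming a POSIX shell and sed; the banner variable here is seeded with the sample output above rather than a live java -version call:

```shell
# Sketch: parse the major Java version from a `java -version` banner.
# In practice, capture it live with: banner="$(java -version 2>&1 | head -n 1)"
banner='openjdk version "1.8.0_171"'
major="$(printf '%s\n' "$banner" | sed -n 's/.*"\(1\.[0-9]*\)\..*/\1/p')"
echo "$major"   # prints 1.8
[ "$major" = "1.8" ] && echo "Java 8 detected"
```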
- When installing from the binary tarball, you can either download the tarball and then extract the files, or use curl.
- Download and extract the tarball:
  - Download the tarball from Download DataStax Enterprise.
  - Extract the files:
    tar -xzvf dse-version_number-bin.tar.gz
    For example:
    tar -xzvf dse-5.1.20-bin.tar.gz
- Use curl to install the selected version:
  CAUTION: If you choose this method, your password is retained in the shell history. To avoid this security issue, DataStax recommends using curl with the --netrc or --netrc-file option.
  Download and extract the tarball using curl:
  curl -L https://downloads.datastax.com/enterprise/dse-version_number-bin.tar.gz | tar xz
  For example:
  curl -L https://downloads.datastax.com/enterprise/dse-5.1.20-bin.tar.gz | tar xz
The files are downloaded and extracted into the 5.1 directory.
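Following the CAUTION above, a netrc file keeps the download password out of the shell history. A minimal sketch, assuming a hypothetical file name ~/.netrc-datastax and placeholder credentials; the curl step is shown commented out because it requires a valid DataStax account:

```shell
# Sketch: store download credentials in a netrc file (placeholders below).
NETRC_FILE="$HOME/.netrc-datastax"   # hypothetical file name
cat > "$NETRC_FILE" <<'EOF'
machine downloads.datastax.com
login your_email@example.com
password your_password
EOF
chmod 600 "$NETRC_FILE"   # restrict the file to the current user
# Then download without the password appearing in shell history:
# curl --netrc-file "$NETRC_FILE" -L https://downloads.datastax.com/enterprise/dse-version_number-bin.tar.gz | tar xz
```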
- You can use either the default data and logging directory locations or define your own locations:
  - Default directory locations: If you want to use the default data and logging directory locations, create and change ownership for the following:
    - /var/lib/cassandra
    - /var/log/cassandra
    sudo mkdir -p /var/lib/cassandra; sudo chown -R $USER:$GROUP /var/lib/cassandra &&
    sudo mkdir -p /var/log/cassandra; sudo chown -R $USER:$GROUP /var/log/cassandra
  - Define your own directory locations: If you want to define your own data and logging directory locations:
    - In the installation_location, make the directories for data and logging. For example:
      mkdir dse-data; chown -R $USER:$GROUP dse-data &&
      cd dse-data &&
      mkdir commitlog; chown -R $USER:$GROUP commitlog &&
      mkdir saved_caches; chown -R $USER:$GROUP saved_caches &&
      mkdir hints; chown -R $USER:$GROUP hints &&
      mkdir cdc_raw; chown -R $USER:$GROUP cdc_raw
    - Go to the directory containing the cassandra.yaml file:
      cd installation_location/resources/cassandra/conf
    - Update the following lines in the cassandra.yaml file to match the custom locations:
      data_file_directories:
          - full_path_to_installation_location/dse-data
      commitlog_directory: full_path_to_installation_location/dse-data/commitlog
      saved_caches_directory: full_path_to_installation_location/dse-data/saved_caches
      hints_directory: full_path_to_installation_location/dse-data/hints
      cdc_raw_directory: full_path_to_installation_location/dse-data/cdc_raw
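The custom-directory sub-steps above can also be scripted in one pass, which helps keep the created paths and the cassandra.yaml entries identical. A sketch, assuming DSE_DATA as a variable name and the current directory as installation_location:

```shell
# Sketch: create the custom data directories in one loop.
# DSE_DATA is an assumed variable; point it at your installation_location.
DSE_DATA="$PWD/dse-data"
OWNER="${USER:-$(id -un)}"
for d in commitlog saved_caches hints cdc_raw; do
  mkdir -p "$DSE_DATA/$d"           # -p makes the script safe to re-run
  chown -R "$OWNER" "$DSE_DATA/$d"  # no sudo needed for your own directories
done
ls "$DSE_DATA"
```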
- Optional: If using DSE analytics, you can use either the default Spark data and logging directory locations or define your own locations:
  - Default directory locations: If you want to use the default Spark directory locations, create and change ownership for the following:
    - /var/lib/dsefs
    - /var/lib/spark
    - /var/log/spark
    sudo mkdir -p /var/lib/dsefs; sudo chown -R $USER:$GROUP /var/lib/dsefs &&
    sudo mkdir -p /var/lib/spark; sudo chown -R $USER:$GROUP /var/lib/spark &&
    sudo mkdir -p /var/log/spark; sudo chown -R $USER:$GROUP /var/log/spark &&
    sudo mkdir -p /var/lib/spark/rdd; sudo chown -R $USER:$GROUP /var/lib/spark/rdd &&
    sudo mkdir -p /var/log/spark/master; sudo chown -R $USER:$GROUP /var/log/spark/master &&
    sudo mkdir -p /var/log/spark/alwayson_sql; sudo chown -R $USER:$GROUP /var/log/spark/alwayson_sql &&
    sudo mkdir -p /var/lib/spark/worker; sudo chown -R $USER:$GROUP /var/lib/spark/worker
  - Define your own directory locations: If you want to define your own Spark directory locations:
    - In the installation_location, make the directories for data and logging. For example:
      mkdir dsefs; chown -R $USER:$GROUP dsefs &&
      mkdir spark; chown -R $USER:$GROUP spark &&
      cd spark &&
      mkdir log; chown -R $USER:$GROUP log &&
      mkdir rdd; chown -R $USER:$GROUP rdd &&
      mkdir worker; chown -R $USER:$GROUP worker &&
      cd log &&
      mkdir worker; chown -R $USER:$GROUP worker &&
      mkdir master; chown -R $USER:$GROUP master &&
      mkdir alwayson_sql; chown -R $USER:$GROUP alwayson_sql
    - Go to the directory containing the spark-env.sh file:
      cd installation_location/resources/spark/conf
    - Uncomment and update the following lines in the spark-env.sh file:
      export SPARK_WORKER_DIR="full_path_to_installation_location/spark/worker"
      export SPARK_EXECUTOR_DIRS="full_path_to_installation_location/spark/rdd"
      export SPARK_WORKER_LOG_DIR="full_path_to_installation_location/spark/log/worker"
      export SPARK_MASTER_LOG_DIR="full_path_to_installation_location/spark/log/master"
      export ALWAYSON_SQL_LOG_DIR="full_path_to_installation_location/spark/log/alwayson_sql"
    - Go to the directory containing the dse.yaml file:
      cd installation_location/resources/dse/conf
    - Uncomment and update the DSEFS work directory in the dsefs_options section of dse.yaml:
      work_dir: full_path_to_installation_location/dsefs
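Similarly, the Spark and DSEFS tree can be built under one root so the five spark-env.sh paths and the dse.yaml work_dir stay consistent. A sketch, assuming SPARK_ROOT and DSEFS_ROOT as variable names and the current directory as installation_location:

```shell
# Sketch: build the Spark and DSEFS directory tree in one command.
# SPARK_ROOT and DSEFS_ROOT are assumed names for this illustration.
SPARK_ROOT="$PWD/spark"
DSEFS_ROOT="$PWD/dsefs"
mkdir -p "$DSEFS_ROOT" \
         "$SPARK_ROOT/rdd" "$SPARK_ROOT/worker" \
         "$SPARK_ROOT/log/worker" "$SPARK_ROOT/log/master" "$SPARK_ROOT/log/alwayson_sql"
find "$SPARK_ROOT" -type d | sort
```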
Result
DataStax Enterprise is ready for additional configuration:
- For production, be sure to change the cassandra user. Failing to do so is a security risk. See Creating superuser accounts.
- DataStax Enterprise provides several types of workloads (default is transactional). See startup options for service or stand-alone installations.
- What's next below provides links to related tasks and information.
What's next
- You must change or delete the cassandra user created on installation. See Creating superuser accounts.
- Configure startup options: service | stand-alone.
- If performing an upgrade, go to the next step in the Upgrade Guide.
- Configuring DataStax Enterprise - Settings for DSE Advanced Security, In-Memory, DSE Advanced Replication, DSE Multi-Instance, DSE Tiered Storage, and more.
- Configuration and log file locations - Services and package installations.
- Configuration and log file locations - No Services and tarball installations.
- Changing logging locations after installation.
- Starting and stopping DataStax Enterprise.
- Preparing DataStax Enterprise for production.
- Recommended production settings.
- Planning and testing DSE and Apache Cassandra cluster deployments.
- Configuring the heap dump directory to avoid server crashes.
- DataStax Studio documentation.