Installing DataStax Enterprise 5.1 using the binary tarball
Use these instructions for installing DataStax Enterprise (DSE) 5.1 on Linux-based platforms using a binary tarball.
Some things to know about installing DSE
-
When installed from the binary tarball, DataStax Enterprise runs as a stand-alone process.
-
This procedure installs DataStax Enterprise 5.1. It does not install the developer-related tools: OpsCenter, DataStax Agent, DataStax Studio, or the DSE Graph Loader.
Prerequisites
-
Configure your operating system to use the latest version of Java 8:
-
Recommended. The latest build of a TCK (Technology Compatibility Kit) Certified OpenJDK version 8. For example, OpenJDK 8 (1.8.0_151 minimum). DataStax’s recommendation changed due to the end of public updates for Oracle JRE/JDK 8. See Oracle Java SE Support Roadmap.
-
Supported. Oracle Java SE 8 (JRE or JDK) (1.8.0_151 minimum)
-
RedHat-compatible distributions require EPEL (Extra Packages for Enterprise Linux).
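As an illustration only (package and repository names can vary by distribution and release), EPEL can often be enabled through the distribution's package manager:
# hypothetical example for a CentOS-style system; on RHEL itself you may need to
# install the epel-release RPM published by the Fedora project instead
sudo yum install -y epel-release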
-
Python 2.7.x or 3.6+. Both are supported for cqlsh. For older RHEL distributions, see Installing Python 2.7 on older RHEL-based package installations.
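A quick way to confirm which Python version is on the PATH (a minimal check; the interpreter name depends on your distribution):
# check whichever interpreter your system provides (one of these may not exist)
python --version
python3 --version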
Procedure
End User License Agreement (EULA). By downloading this DataStax product, you agree to the terms of the EULA.
In a terminal window:
-
Verify that a required version of Java is installed:
java -version
DataStax recommends the latest build of a Technology Compatibility Kit (TCK) Certified OpenJDK version 8.
If OpenJDK, the results should look like:
openjdk version "1.8.0_171"
OpenJDK Runtime Environment (build 1.8.0_171-8u171-b11-0ubuntu0.16.04.1-b11)
OpenJDK 64-Bit Server VM (build 25.171-b11, mixed mode)
If Oracle Java, the results should look like:
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
If not OpenJDK 8 or Oracle Java 8, see Installing the JDK.
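If multiple Java versions are installed, one way to point the system at Java 8 is the alternatives mechanism, optionally combined with JAVA_HOME. The menu entries and paths below are examples and differ per system:
# interactively select the Java 8 entry (Debian/Ubuntu and RHEL-style systems)
sudo update-alternatives --config java
# optionally export JAVA_HOME in your shell profile; the path below is an example
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64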
-
When installing from the binary tarball, you can either download the tarball and then extract the files, or use curl.
-
Download and extract the tarball:
The latest version is 5.1. To view the available versions, see the Release notes.
-
Download the tarball from Download DataStax Enterprise.
-
Extract the files:
tar -xzvf dse-version_number-bin.tar.gz
For example:
tar -xzvf dse-5.1-bin.tar.gz
-
Use curl to install the selected version:
If you choose this method, your password is retained in the shell history. To avoid this security issue, DataStax recommends using curl with the --netrc or --netrc-file option.
Download and extract the tarball using curl:
curl -L https://downloads.datastax.com/enterprise/dse-version_number-bin.tar.gz | tar xz
For example:
curl -L https://downloads.datastax.com/enterprise/dse-5.1-bin.tar.gz | tar xz
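A minimal sketch of the --netrc-file approach, assuming the download requires credentials (the file name, host entry, and credential values below are placeholders):
# store credentials in a file readable only by you
cat > ~/.dse-netrc <<'EOF'
machine downloads.datastax.com login your_email password your_password
EOF
chmod 600 ~/.dse-netrc
# let curl read the credentials instead of putting them on the command line
curl --netrc-file ~/.dse-netrc -L https://downloads.datastax.com/enterprise/dse-version_number-bin.tar.gz | tar xz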
-
The files are downloaded and extracted into the 5.1 directory.
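Optionally, if you want a stable path to use as <installation_location> in the steps that follow, you can create a symbolic link to the extracted directory. This is only a convenience sketch; the source name below is an example:
# adjust the source name to match the directory that was actually extracted
ln -s dse-5.1 dse
cd dse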
-
You can either use the default data and logging directory locations or define your own:
-
Default directory locations: If you want to use the default data and logging directory locations, create and change ownership for the following:
-
/var/lib/cassandra
-
/var/log/cassandra
sudo mkdir -p /var/lib/cassandra; sudo chown -R $USER:$GROUP /var/lib/cassandra &&
sudo mkdir -p /var/log/cassandra; sudo chown -R $USER:$GROUP /var/log/cassandra
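To verify the directories exist and are owned by the account that will run DSE, a quick check such as the following can help:
ls -ld /var/lib/cassandra /var/log/cassandra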
-
Define your own directory locations: If you want to define your own data and logging directory locations:
-
In the <installation_location>, create the data and logging directories. For example:
mkdir dse-data; chown -R $USER:$GROUP dse-data &&
cd dse-data &&
mkdir commitlog; chown -R $USER:$GROUP commitlog &&
mkdir saved_caches; chown -R $USER:$GROUP saved_caches &&
mkdir hints; chown -R $USER:$GROUP hints &&
mkdir cdc_raw; chown -R $USER:$GROUP cdc_raw
-
Go to the directory containing the cassandra.yaml file:
cd installation_location/resources/cassandra/conf
-
Update the following lines in the cassandra.yaml file to match the custom locations:
data_file_directories:
    - full_path_to_installation_location/dse-data
commitlog_directory: full_path_to_installation_location/dse-data/commitlog
saved_caches_directory: full_path_to_installation_location/dse-data/saved_caches
hints_directory: full_path_to_installation_location/dse-data/hints
cdc_raw_directory: full_path_to_installation_location/dse-data/cdc_raw
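As a quick sanity check (illustrative only), you can list the directory-related keys to confirm they now point at your custom locations; note that the data_file_directories value appears on the indented line below the key:
grep -n -A 1 -E '^(data_file_directories|commitlog_directory|saved_caches_directory|hints_directory|cdc_raw_directory)' cassandra.yaml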
-
Optional: If using DSE Analytics, you can either use the default Spark data and logging directory locations or define your own:
-
Default directory locations: If you want to use the default Spark directory locations, create and change ownership for the following:
-
/var/lib/dsefs
-
/var/lib/spark
-
/var/log/spark
sudo mkdir -p /var/lib/dsefs; sudo chown -R $USER:$GROUP /var/lib/dsefs &&
sudo mkdir -p /var/lib/spark; sudo chown -R $USER:$GROUP /var/lib/spark &&
sudo mkdir -p /var/log/spark; sudo chown -R $USER:$GROUP /var/log/spark &&
sudo mkdir -p /var/lib/spark/rdd; sudo chown -R $USER:$GROUP /var/lib/spark/rdd &&
sudo mkdir -p /var/log/spark/master; sudo chown -R $USER:$GROUP /var/log/spark/master &&
sudo mkdir -p /var/log/spark/alwayson_sql; sudo chown -R $USER:$GROUP /var/log/spark/alwayson_sql &&
sudo mkdir -p /var/lib/spark/worker; sudo chown -R $USER:$GROUP /var/lib/spark/worker
-
Define your own directory locations: If you want to define your own Spark directory locations:
-
In the <installation_location>, create the Spark data and logging directories. For example:
mkdir dsefs; chown -R $USER:$GROUP dsefs &&
mkdir spark; chown -R $USER:$GROUP spark &&
cd spark &&
mkdir log; chown -R $USER:$GROUP log &&
mkdir rdd; chown -R $USER:$GROUP rdd &&
mkdir worker; chown -R $USER:$GROUP worker &&
cd log &&
mkdir worker; chown -R $USER:$GROUP worker &&
mkdir master; chown -R $USER:$GROUP master &&
mkdir alwayson_sql; chown -R $USER:$GROUP alwayson_sql
-
Go to the directory containing the spark-env.sh file:
cd installation_location/resources/spark/conf
-
Uncomment and update the following lines in the spark-env.sh file:
export SPARK_WORKER_DIR="full_path_to_installation_location/spark/worker"
export SPARK_EXECUTOR_DIRS="full_path_to_installation_location/spark/rdd"
export SPARK_WORKER_LOG_DIR="full_path_to_installation_location/spark/log/worker"
export SPARK_MASTER_LOG_DIR="full_path_to_installation_location/spark/log/master"
export ALWAYSON_SQL_LOG_DIR="full_path_to_installation_location/spark/log/alwayson_sql"
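To confirm the lines are uncommented and point at your directories, something like the following (illustrative) works:
grep -n -E '^export (SPARK_WORKER_DIR|SPARK_EXECUTOR_DIRS|SPARK_WORKER_LOG_DIR|SPARK_MASTER_LOG_DIR|ALWAYSON_SQL_LOG_DIR)' spark-env.sh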
-
Go to the directory containing the dse.yaml file (the dsefs_options section is in dse.yaml):
cd installation_location/resources/dse/conf
-
Uncomment and update the DSEFS data directory (work_dir, in the dsefs_options section) in the dse.yaml file:
The location of the dse.yaml file depends on the type of installation:
Package installations and Installer-Services installations: /etc/dse/dse.yaml
Tarball installations and Installer-No Services installations: <installation_location>/resources/dse/conf/dse.yaml
work_dir: full_path_to_installation_location/dsefs
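As a final check (illustrative), confirm that the DSEFS data directory exists and that dse.yaml references it:
# paths follow the examples above; adjust to your own locations
ls -ld full_path_to_installation_location/dsefs
grep -n 'work_dir' dse.yaml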
-
Result
DataStax Enterprise is ready for additional configuration:
-
For production, be sure to change the cassandra user. Failing to do so is a security risk. See Creating superuser accounts.
-
DataStax Enterprise supports several workload types (the default is transactional). See the startup options for service or stand-alone installations.
-
The Next steps section below provides links to related tasks and information.
-
Optional: Single-node cluster installations only:
-
Start DataStax Enterprise from the installation directory:
bin/dse cassandra
For other start options, see Starting DataStax Enterprise as a stand-alone process.
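For example, the stand-alone start command accepts flags that enable additional workloads on the node. The lines below are a hedged illustration; confirm the flags against the startup options documentation for your release:
bin/dse cassandra        # transactional only (default)
bin/dse cassandra -k     # also enable Spark (DSE Analytics)
bin/dse cassandra -s     # also enable DSE Search
bin/dse cassandra -g     # also enable DSE Graph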
-
From the installation directory, verify that DataStax Enterprise is running:
bin/nodetool status
Results using vnodes:
Datacenter: Cassandra
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns  Host ID                               Rack
UN  127.0.0.1  82.43 KB  128     ?     40725dc8-7843-43ae-9c98-7c532b1f517e  rack1
Results not using vnodes:
Datacenter: Analytics
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Owns  Host ID                               Token                Rack
UN  172.16.222.136  103.24 KB  ?     3c1d0657-0990-4f78-a3c0-3e0c37fc3a06  1647352612226902707  rack1
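Optionally, you can also confirm that the node accepts CQL connections; a minimal check, assuming default ports and that authentication has not yet been configured:
bin/cqlsh -e "SELECT release_version FROM system.local;"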
-
Next steps
-
You must change or delete the cassandra user created on installation. See Creating superuser accounts.
-
Configure startup options: service | stand-alone.
-
If performing an upgrade, go to the next step in the Upgrade Guide.
-
Configuring DataStax Enterprise - Settings for DSE Advanced Security, In-Memory, DSE Advanced Replication, DSE Multi-Instance, DSE Tiered Storage, and more.
-
Configuration and log file locations - Services and package installations.
-
Configuration and log file locations - No Services and tarball installations.
-
Changing logging locations after installation.
-
Planning and testing DSE and Apache Cassandra™ cluster deployments.
-
Configuring the heap dump directory to avoid server crashes.
-
DataStax Studio documentation.