Install DataStax Enterprise 6.9 using the binary tarball
Start with a bare metal or VM environment to quickly deploy DataStax Enterprise (DSE) 6.9 in a compact installation in a single directory. The single directory stores binaries, data, and logs. Any user can run DSE 6.9 on any supported Linux or OS/X platform.
Key facts to know about installing DataStax Enterprise:
-
These instructions apply to all versions of DSE 6.9. Review specific changes in the DSE 6.9 release notes.
-
A DSE binary tarball enables:
-
DataStax Enterprise to run as a stand-alone process.
-
A user to install with or without root permissions.
-
A deployment of DSE creates a |
-
The default locations of the configuration files are:
-
spark-env.sh: installation_location/resources/spark/conf/spark-env.sh
-
cassandra.yaml: installation_location/resources/cassandra/conf/cassandra.yaml
-
Prerequisites
-
Configure your operating system to use Java 11 and make it available on the executable path:
-
When running multiple Java runtime environments, you must set the $JAVA_HOME environment variable to point to Java 11.
-
Recommended: Gain access to the latest build of a TCK (Technology Compatibility Kit) Certified OpenJDK version 11.
-
Supported: Oracle Java SE 11.0.x (JDK).
-
-
RedHat-compatible distributions require EPEL (Extra Packages for Enterprise Linux).
-
Python 3.8 to 3.11 is required to run cqlsh.
Download and deploy DataStax Enterprise (DSE) 6.9
End User License Agreement (EULA). By downloading this DataStax product, you agree to the terms of the EULA.
-
Verify that you have installed the required Java version 11:
-
Terminal window command
java -version
-
OpenJDK sample result
openjdk version "11.0.x" YYYY-MM-DD
OpenJDK Runtime Environment (build 11.0.x+xx)
OpenJDK 64-Bit Server VM (build 11.0.x+xx, mixed mode)
-
Oracle Java sample result
java version "11.0.x" YYYY-MM-DD LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.x+xx-LTS)
Java HotSpot(TM) 64-Bit Server VM (build 11.0.x+xx-LTS, mixed mode)
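The version check above can also be scripted. The following is a minimal sketch, not part of the DSE tooling: the function names `java_major` and `check_java_11` are hypothetical, and the parsing assumes the first line of output contains a dotted version string in double quotes, as in the sample results.

```shell
# Hypothetical helper: print the major version found in a line such as
#   openjdk version "11.0.2" 2019-01-15
# Prints nothing if the line has no quoted, dotted version string.
java_major() {
  printf '%s\n' "$1" | sed -n 's/.*version "\([0-9][0-9]*\)\..*/\1/p'
}

# Hypothetical helper: succeed only when the java on the PATH reports major version 11.
# java -version writes to standard error, so redirect it before reading the first line.
check_java_11() {
  first_line=$(java -version 2>&1 | head -n 1)
  [ "$(java_major "$first_line")" = "11" ]
}
```

For example, `check_java_11 || echo "Java 11 is not first on the PATH"` reports a mismatch without printing the full version banner.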
-
-
From a terminal window, install the libaio package that matches your environment:
-
RHEL platform
sudo yum install libaio
-
Debian platform
sudo apt-get install libaio1
-
-
Download and extract the binary tarball files manually or use curl:
-
Manual download and extract
Extract the binary tarball files into the directory of your choice:
tar -xzvf dse-6.9.0-bin.tar.gz
-
Curl download and extract
If you enter your password on the command line, the shell history retains it. To avoid this security issue, use curl with its --netrc or --netrc-file option:
curl --netrc -L https://downloads.datastax.com/enterprise/dse-6.9.0-bin.tar.gz | tar xz
The command downloads and extracts the files into the dse-6.9.0 subdirectory.
Choose to start DSE from this installation directory and store logs and data there too, or define your own locations.
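One way to prepare the file that curl reads with --netrc-file is sketched below. This is an illustration under stated assumptions, not a DataStax-provided script: the function name `write_netrc` is hypothetical, the host name is taken from the download URL above, and the entry uses the standard `machine ... login ... password ...` netrc layout.

```shell
# Hypothetical helper: write a one-line netrc entry for the download host.
#   $1 = path of the netrc file to write
#   $2 = login name
# The password is read from standard input, so it is never typed as a
# command-line argument and does not end up in the shell history.
write_netrc() {
  netrc_file=$1
  login=$2
  IFS= read -r password
  # umask 077 makes a newly created file readable only by its owner.
  ( umask 077
    printf 'machine downloads.datastax.com login %s password %s\n' \
      "$login" "$password" > "$netrc_file" )
}
```

After running `write_netrc "$HOME/.dse-netrc" my_login` and typing the password at the prompt-less input line, the download command becomes `curl --netrc-file "$HOME/.dse-netrc" -L https://downloads.datastax.com/enterprise/dse-6.9.0-bin.tar.gz | tar xz`. The file path and login shown here are placeholders.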
-
Use either the default data and logging directory locations or define your own locations:
-
Default directory locations
To use the default data and logging directory locations, create the following directories and change their ownership:
-
/var/lib/cassandra
-
/var/log/cassandra
sudo mkdir -p /var/lib/cassandra; sudo chown -R $USER:$GROUP /var/lib/cassandra
sudo mkdir -p /var/log/cassandra; sudo chown -R $USER:$GROUP /var/log/cassandra
-
Define your own directory locations
-
In the installation location, make the directories for data and logging. For example:
mkdir dse-data && cd dse-data && mkdir data && mkdir commitlog && mkdir saved_caches && mkdir hints && mkdir cdc_raw
-
Change to the directory containing the cassandra.yaml file:
cd installation_location/resources/cassandra/conf
-
Update the following lines in the cassandra.yaml file to match the custom locations:
data_file_directories:
    - full_path_to_installation_location/dse-data/data
commitlog_directory: full_path_to_installation_location/dse-data/commitlog
saved_caches_directory: full_path_to_installation_location/dse-data/saved_caches
hints_directory: full_path_to_installation_location/dse-data/hints
cdc_raw_directory: full_path_to_installation_location/dse-data/cdc_raw
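The custom-location steps above can be combined into one small script. The sketch below is illustrative only: the function name `make_dse_data_dirs` is hypothetical, and it places every directory under dse-data, matching the mkdir example. It prints the settings rather than editing cassandra.yaml, so the output still has to be copied into the file by hand.

```shell
# Hypothetical helper: create the data directories under an installation
# location and print the matching cassandra.yaml settings.
#   $1 = full path to the installation location
make_dse_data_dirs() {
  base=$1/dse-data
  for d in data commitlog saved_caches hints cdc_raw; do
    mkdir -p "$base/$d" || return 1
  done
  printf 'data_file_directories:\n    - %s/data\n' "$base"
  printf 'commitlog_directory: %s/commitlog\n' "$base"
  printf 'saved_caches_directory: %s/saved_caches\n' "$base"
  printf 'hints_directory: %s/hints\n' "$base"
  printf 'cdc_raw_directory: %s/cdc_raw\n' "$base"
}
```

Using `mkdir -p` makes the function safe to run again if some of the directories already exist.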
-
-
-
To store logs and data in the installation location, use the environment variable CASSANDRA_LOG_DIR to specify the location of the logs directory:
cd dse-6.9.0
CASSANDRA_LOG_DIR=`pwd`/logs bin/dse cassandra
-
Apply additional configurations to your DSE installation:
-
For production, be sure to change the cassandra user. Failing to do so is a security risk. See Adding a superuser login.
-
DataStax Enterprise (DSE) provides several types of workloads; the default is transactional. See startup options for service or stand-alone installations.
-
Next Steps provides links to related tasks and information.
-
-
Optional: To use DSE analytics, choose either the default Spark data and logging directory locations or define your own locations:
-
Default directory locations
To use the default Spark directory locations, create the following directories and change their ownership:
-
/var/lib/dsefs
-
/var/lib/spark
-
/var/log/spark
sudo mkdir -p /var/lib/dsefs; sudo chown -R $USER:$GROUP /var/lib/dsefs
sudo mkdir -p /var/lib/spark; sudo chown -R $USER:$GROUP /var/lib/spark
sudo mkdir -p /var/log/spark; sudo chown -R $USER:$GROUP /var/log/spark
sudo mkdir -p /var/lib/spark/rdd; sudo chown -R $USER:$GROUP /var/lib/spark/rdd
sudo mkdir -p /var/log/spark/master; sudo chown -R $USER:$GROUP /var/log/spark/master
sudo mkdir -p /var/log/spark/alwayson_sql; sudo chown -R $USER:$GROUP /var/log/spark/alwayson_sql
sudo mkdir -p /var/lib/spark/worker; sudo chown -R $USER:$GROUP /var/lib/spark/worker
-
Define your own Spark directory locations
-
In the installation location, make the directories for data and logging. For example:
mkdir dsefs && mkdir spark && cd spark && mkdir log && mkdir rdd && mkdir worker && cd log && mkdir worker && mkdir master && mkdir alwayson_sql
-
Change to the directory containing the spark-env.sh file:
cd installation_location/resources/spark/conf
-
Uncomment and update the following lines in the spark-env.sh file:
export SPARK_WORKER_DIR="full_path_to_installation_location/spark/worker"
export SPARK_EXECUTOR_DIRS="full_path_to_installation_location/spark/rdd"
export SPARK_WORKER_LOG_DIR="full_path_to_installation_location/spark/log/worker"
export SPARK_MASTER_LOG_DIR="full_path_to_installation_location/spark/log/master"
export ALWAYSON_SQL_LOG_DIR="full_path_to_installation_location/spark/log/alwayson_sql"
-
Change to the directory containing the dse.yaml file:
cd installation_location/resources/dse/conf
-
Uncomment and update the DSEFS directory in the dsefs_options section of dse.yaml:
work_dir: full_path_to_installation_location/dsefs
DSE 6.9 is ready for additional configuration. See Next Steps.
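The custom Spark directory layout can be created in one pass. The sketch below is a hedged illustration: the function name `make_spark_dirs` is hypothetical, and it only prints the export lines for spark-env.sh and the work_dir line for dse.yaml; it does not edit either file.

```shell
# Hypothetical helper: create the DSEFS and Spark directories under an
# installation location, then print the settings that refer to them.
#   $1 = full path to the installation location
make_spark_dirs() {
  root=$1
  mkdir -p "$root/dsefs" "$root/spark/rdd" "$root/spark/worker" \
    "$root/spark/log/worker" "$root/spark/log/master" \
    "$root/spark/log/alwayson_sql" || return 1
  # Lines for spark-env.sh
  printf 'export SPARK_WORKER_DIR="%s/spark/worker"\n' "$root"
  printf 'export SPARK_EXECUTOR_DIRS="%s/spark/rdd"\n' "$root"
  printf 'export SPARK_WORKER_LOG_DIR="%s/spark/log/worker"\n' "$root"
  printf 'export SPARK_MASTER_LOG_DIR="%s/spark/log/master"\n' "$root"
  printf 'export ALWAYSON_SQL_LOG_DIR="%s/spark/log/alwayson_sql"\n' "$root"
  # Line for dse.yaml
  printf 'work_dir: %s/dsefs\n' "$root"
}
```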
-
-
Optimizations
-
To run stress or performance tests, you might need to do some additional setup on the machine running DSE. Particularly for vector search and SAI experiments, you need to temporarily set user resource limits for DSE. Alternatively, as the root user, make the changes permanent by adding the following settings to the configuration file for tarball installations, /etc/security/limits.conf:
<cassandra_user> - memlock unlimited
<cassandra_user> - nofile 1048576
<cassandra_user> - nproc 32768
<cassandra_user> - as unlimited
-
Disable swap:
sudo swapoff --all
-
Install jemalloc to optimize memory allocations. DSE automatically detects the library if it is available.
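To see whether the current shell already meets the resource limits listed above, a check along the following lines may help. It is a sketch under assumptions: the function names `limit_ok` and `report_limits` are hypothetical, and `report_limits` assumes bash, where `ulimit -l`, `-n`, `-u`, and `-v` report locked memory, open files, user processes, and virtual memory.

```shell
# Hypothetical helper: succeed when a current limit meets the required value.
# Either argument may be a number or the word "unlimited".
limit_ok() {
  current=$1
  required=$2
  [ "$current" = "unlimited" ] && return 0
  [ "$required" = "unlimited" ] && return 1
  [ "$current" -ge "$required" ]
}

# Hypothetical helper (bash assumed): print one line per limit that falls
# short of the values suggested for limits.conf.
report_limits() {
  limit_ok "$(ulimit -l)" unlimited || echo "memlock is not unlimited"
  limit_ok "$(ulimit -n)" 1048576   || echo "nofile is below 1048576"
  limit_ok "$(ulimit -u)" 32768     || echo "nproc is below 32768"
  limit_ok "$(ulimit -v)" unlimited || echo "as is not unlimited"
}
```

Run `report_limits` as the user that starts DSE; no output means every limit meets the suggested values.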
-
-
Single-node cluster installations only:
-
Start DSE 6.9 from its installation directory:
bin/dse cassandra
For other start options, see Starting DataStax Enterprise as a stand-alone process.
-
Verify that DSE 6.9 is running from the installation directory:
-
Nodetool command
bin/nodetool status
-
Results using vnodes
Datacenter: Cassandra
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns  Host ID                               Rack
UN  127.0.0.1  82.43 KB  128     ?     40725dc8-7843-43ae-9c98-7c532b1f517e  rack1
-
Results not using vnodes
Datacenter: Analytics
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Owns  Host ID                               Token                Rack
UN  172.16.222.136  103.24 KB  ?     3c1d0657-0990-4f78-a3c0-3e0c37fc3a06  1647352612226902707  rack1
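In the results above, the UN prefix means the node is Up and in the Normal state. A script can check for that prefix instead of reading the table by eye. The following is a minimal sketch; the function name `count_un_nodes` is hypothetical, and it assumes the status code is the first whitespace-separated field of each node line, as in the sample output.

```shell
# Hypothetical helper: read nodetool status output on standard input and
# print how many node lines start with the UN (Up/Normal) status code.
count_un_nodes() {
  awk '$1 == "UN" { n++ } END { print n + 0 }'
}
```

For a single-node installation, `bin/nodetool status | count_un_nodes` is expected to print 1 once the node has finished starting.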
-
-
Connect to DSE
DSE 6.9 is running!
Establish a connection to DSE 6.9 using the Cassandra Query Language (CQL) shell, cqlsh.
Next Steps
-
You must change or delete the cassandra user created on installation. See Adding a superuser login.
-
Configure startup options: service or stand-alone.
-
If performing an upgrade, proceed to the Upgrade Guide.
-
Configuring DataStax Enterprise: settings for DSE Advanced Security, In-Memory, DSE Advanced Replication, DSE Multi-Instance, DSE Tiered Storage, and more.
-
Default file locations for package installations.
-
Default file locations for tarball installations.
-
Changing logging locations after installation.
-
Starting and stopping DataStax Enterprise.
-
Preparing DataStax Enterprise for production.
-
Recommended production settings.
-
Plan and test DSE cluster deployments.
-
Configuring the heap dump directory to avoid server crashes.
-
DataStax Studio documentation.
-
Installing DataStax Enterprise drivers.