Recommended settings for DataStax Enterprise (DSE) Docker containers
DataStax provides the following general recommendations for running DataStax Enterprise (DSE) Docker containers. You might need to adapt the settings for your use case or environment. Test these settings in isolation before applying them in production.
Container architecture for replication and high availability
DSE achieves resilience and high availability through a group of nodes that replicate data across the cluster. This replication ensures that if any individual node fails, access to data is not lost and performance is maintained. However, in a containerized environment, running multiple DSE nodes on the same physical hardware will introduce a single point of failure.
To avoid a single point of failure, run only one DataStax container per DSE cluster on each Docker host. If you run multiple DataStax containers on a single Docker host, ensure that the containers belong to different DSE clusters.
Hardware and system settings
- Docker container resource requirements
-
For minimum container resource requirements, follow the capacity planning guidance for selecting hardware for production environments.
- Optimize disk settings
-
The default SSD configurations on most Linux distributions are not optimal. For recommended settings, see Optimize SSDs.
The optimum readahead setting for RAID on SSDs (in Amazon EC2) is 8 KB, the same as it is for non-RAID SSDs. To optimize RAID settings for spinning disks on the host, DataStax recommends readahead of 128 KB. For more information, see Optimize spinning disks.
- Synchronize clocks
-
Because time is not namespaced in the Linux kernel, containers share the clock with the Docker host machine. Ensure that clocks are synchronized on the host machines and containers by configuring NTP or other methods on the host machines.
- Disable swap
-
Swapping must be disabled for performance and node stability.
If you disable swap on the Docker host, the host passes that setting to each container. For more information and instructions, see Disable swap.
Alternatively, to disable swap for specific containers, see Preventing a container from using SWAP.
- Disable CPU frequency scaling
-
To ensure optimal performance, don’t use governors that lower the CPU frequency. Instead, reconfigure all CPUs to use the performance governor on the Docker hosts. For more information and instructions, see Disable CPU frequency scaling.
- Disable THP on the Docker host
-
Disable defrag on the Docker host to avoid performance issues caused by Transparent Hugepages (THP) defragmentation of 4 KB chunks into 2 MB chunks. For more information and instructions, see Check Java Hugepages settings.
- Increase user resource limits
-
All containers by default inherit user limits from the Docker daemon. In production environments, DSE expects the following ulimit settings:
ulimit -n 100000 # nofile: max number of open files
ulimit -l unlimited # memlock: maximum locked-in-memory address space
To configure user resource limits for Docker containers, do the following:
-
Check the Docker daemon defaults for ulimits:
docker run --rm ubuntu /bin/bash -c 'ulimit -a'
-
Configure ulimit when starting Docker containers by appending ulimit options to the docker run command:
--ulimit nofile=100000:100000 --ulimit nproc=32768 --ulimit memlock=-1:-1
-
Enable mlock by appending the --cap-add option to the docker run command:
--cap-add=IPC_LOCK
This is required because DSE tries to lock memory using mlock, but Docker disables memory lock by default.
-
On the Docker host, get the value of vm.max_map_count:
cat /proc/sys/vm/max_map_count
-
If it isn’t set to 1048575, add the following line to /etc/sysctl.conf:
vm.max_map_count = 1048575
-
Run sysctl -p to apply the changes.
For more information, see Set user resource limits.
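The host-side check in the last three steps can be sketched as a small shell function. This is a minimal sketch under stated assumptions, not part of any DataStax tooling: the function name check_max_map_count and its optional file argument are hypothetical and exist only so that the comparison can be tried against a sample file. The value 1048575 is the one given in the steps above.

```shell
#!/bin/sh
# Hypothetical helper: compare the current vm.max_map_count with the
# value recommended in the steps above and report whether a change is needed.
REQUIRED_MAX_MAP_COUNT=1048575

check_max_map_count() {
    # $1: file to read the current value from
    #     (defaults to the kernel's procfs entry used in the steps above)
    src="${1:-/proc/sys/vm/max_map_count}"
    current="$(cat "$src")"
    if [ "$current" -eq "$REQUIRED_MAX_MAP_COUNT" ]; then
        echo "ok: vm.max_map_count is $current"
    else
        echo "change needed: vm.max_map_count is $current, expected $REQUIRED_MAX_MAP_COUNT"
        return 1
    fi
}
```

Calling check_max_map_count with no argument on the Docker host reads the live value. A non-zero exit status means the /etc/sysctl.conf edit and sysctl -p steps still apply.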
- Configure heap settings
-
For each container in production environments, explicitly set the JVM heap size using the JVM_EXTRA_OPTS environment variable with the docker run command. For example, to use 16 GB for the JVM heap, use the following option:
docker run -e JVM_EXTRA_OPTS="-Xms16g -Xmx16g"
If not explicitly set, DSE sets the heap to 25 percent of the physical RAM of the Docker host, which is not optimal for performance and stability.
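The ulimit, memory-lock, and heap options described in this section can be combined into one docker run command. The following is a minimal sketch that prints such a command for review instead of executing it. The function name dse_run_command, the container name my-dse, and the image name datastax/dse-server are assumptions made for illustration. The DS_LICENSE=accept variable is borrowed from the OpsCenter example in the Ports section.

```shell
#!/bin/sh
# Hypothetical helper: print a docker run command that combines the
# ulimit, memory-lock, and heap options described in this section.
# Assumption: the image name below is illustrative; substitute your own.
DSE_IMAGE="datastax/dse-server"

dse_run_command() {
    # $1: container name
    # $2: JVM heap size in GB (used for both -Xms and -Xmx)
    name="$1"
    heap_gb="$2"
    # Backslash-newline inside double quotes is a line continuation,
    # so the command is printed on a single line.
    printf '%s\n' "docker run -d --name $name \
--ulimit nofile=100000:100000 --ulimit nproc=32768 --ulimit memlock=-1:-1 \
--cap-add=IPC_LOCK \
-e DS_LICENSE=accept \
-e JVM_EXTRA_OPTS=\"-Xms${heap_gb}g -Xmx${heap_gb}g\" \
$DSE_IMAGE"
}

# Print the command for a 16 GB heap; nothing is started.
dse_run_command my-dse 16
```

Printing the command first makes it easy to confirm that the minimum and maximum heap sizes match before a container is started.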
Host storage and resource recommendations
- Mount volumes to load custom configuration files
-
Use the DSE configuration volume to load custom configuration files without creating a custom image.
- Mount volumes to persist data
-
To avoid data loss when deleting and recreating containers, mount volumes to persist container data.
The DSE Docker container writes all node-specific data in the directories under /var/lib/cassandra/ by default. To persist this data, you must map the data directories inside the container to a directory on the host file system. To do this, you can use the -v option with the docker run command, or use a volume driver.
- Docker storage driver mode
-
If you use the Docker devicemapper storage driver, don’t use the default loop-lvm mode, which is only appropriate for testing. Instead, configure docker-engine to use direct-lvm mode, which is suitable for production environments.
- Docker host VM resources for macOS and Microsoft Windows
-
On macOS and Windows, the default resources allocated to the Linux VM that runs Docker are generally insufficient for running DataStax containers, particularly in production or with simulated production-level workloads. Adjust these resources as appropriate to meet the requirements for your containers. For more information, see the Docker documentation.
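The data-volume mapping described under "Mount volumes to persist data" can be sketched as a helper that builds the -v option. This is an illustrative sketch only: the function name volume_option and the host path /dse/data are hypothetical. The container path is the default data location named above.

```shell
#!/bin/sh
# Hypothetical helper: build the -v option that maps a host directory
# to the container's default data directory.
CONTAINER_DATA_DIR="/var/lib/cassandra"

volume_option() {
    # $1: absolute path of the host directory that should hold the node's data
    host_dir="$1"
    case "$host_dir" in
        # Docker treats a -v source that is not an absolute path as a named
        # volume, so only absolute host paths are accepted here.
        /*) printf '%s\n' "-v $host_dir:$CONTAINER_DATA_DIR" ;;
        *)  echo "host directory must be an absolute path: $host_dir" >&2
            return 1 ;;
    esac
}

# Print the option for an example host directory.
volume_option /dse/data
```

The printed option can be appended to a docker run command so that the node's data survives deleting and recreating the container.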
Host networking
Because the default network settings in Docker (through Linux bridge) slow networking considerably, don’t use the default network settings in production environments.
Instead, use Docker host networking. This limits the number of nodes per Docker host to one, which is the recommended configuration to use in production.
To enable Docker host networking, append the --network host option to the docker run command, or use a plugin that can manage IP ranges across clusters of hosts:
docker run -d --network host --name container_name
Ports
Communication occurs on many different ports. Account for required communication and security for DSE ports when binding ports to the Docker host.
To allow remote hosts to access a DSE, DSE OpsCenter, or DataStax Studio container, map the DSE public port to a host port using the -p option with the docker run command.
For example, to allow access to a DSE OpsCenter container from a browser on a remote host, open port 8888:
docker run -e DS_LICENSE=accept --name my-opscenter -p 8888:8888 \
-d datastax/dse-opscenter
When mapping a container port to a local host port, make sure the host port is not in use by another container or the host.
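One way to check whether a host port is already taken before mapping it is to inspect the host's listening TCP sockets. The following is a minimal sketch, assuming the host provides the ss utility. The function name port_is_listening is hypothetical. The check only sees sockets that ss reports on the host, so treat a negative result as a hint, not a guarantee.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a TCP port appears as a listening local
# port in `ss -ltn` style output read from standard input.
port_is_listening() {
    # $1: port number to look for
    awk -v port="$1" '
        # Column 4 of ss -ltn output is "Local Address:Port".
        $4 ~ (":" port "$") { found = 1 }
        END { exit found ? 0 : 1 }
    '
}
```

For example, `ss -ltn | port_is_listening 8888 && echo "port 8888 is already in use"` prints a warning when something on the host is already listening on port 8888.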