About DSE 4.8: DataStax Enterprise delivers Apache Cassandra™ in a database platform that meets the performance and availability demands of Internet-of-Things (IoT), Web, and Mobile applications. It provides enterprises a secure, fast, always-on database that remains operationally simple when scaled in a single datacenter or across multiple datacenters and clouds.
Upgrading: See the DataStax Upgrade Guide.
Installing: DataStax Enterprise installation methods include GUI or text mode, unattended command line or properties file, YUM and APT repository, and binary tarball.
Installer - GUI or Text mode: DataStax Enterprise production installation or upgrade on any Linux-based platform using a graphical or text interface.
Installing DataStax Enterprise 4.8 on Linux without root permissions or on Mac OS X: Installing a cluster without root permissions on any Linux platform or Mac OS X using the DataStax Enterprise installer.
Installer - unattended: Install DataStax Enterprise using the command line or properties file.
Other install methods: Installation using YUM or APT packages or binary tarball.
On cloud providers: Information for installing DataStax Enterprise on CenturyLink Cloud, Google Compute Engine, Microsoft Azure, and Amazon EC2.
Installing EPEL on RHEL OS 5.x: Install Extra Packages for Enterprise Linux on RHEL OS 5.x.
Installing glibc on Oracle Linux: To install DSE on Oracle Enterprise Linux 6.x and later, install the 32-bit versions of the glibc libraries.
Uninstalling: Launch the uninstaller in the installation directory to uninstall DataStax Enterprise and DataStax Agent.
Starting & stopping DSE: You can start and stop DataStax Enterprise as a service or stand-alone process.
Starting as a service: Starting the DataStax Enterprise service when DataStax Enterprise was installed from the DataStax Installer with the Services option or from a package.
Starting as a stand-alone process: Starting the DataStax Enterprise process when DataStax Enterprise was installed from the DataStax Installer with the No Services option or from a tarball.
Stopping a node: Stopping DataStax Enterprise and the DataStax Agent on a node.
Configuration: Information about configuring DataStax Enterprise.
dse.yaml: dse.yaml is the primary DataStax Enterprise configuration file.
Configuring and using virtual nodes (vnodes): A description of virtual nodes (vnodes) and how to use them in different types of datacenters. Also includes steps for disabling vnodes.
File locations: Installer-Services and Package: Locations when installing from the DataStax All-in-One Installer with Services option or package installations.
File locations: Installer-No Services and Tarball: Locations when installing from the DataStax All-in-One Installer with No Services selected or tarball installations.
Changing logging locations: Changing logging locations after installation.
Collecting node health and indexing status scores: Steps to configure node health data collection, and commands to retrieve health and indexing scores.
DSE Analytics: DataStax Enterprise analytics includes integration with Apache Spark, BYOH (bring your own Hadoop), and DSE Hadoop.
About DSE Analytics: Use DSE Analytics to analyze huge databases. DSE Analytics includes integration with Apache Spark, BYOH (bring your own Hadoop), and DSE Hadoop.
DSE Analytics and Search integration: DSE SearchAnalytics clusters can use DSE Search queries within DSE Analytics jobs.
About the Cassandra File System (CFS): A Hive or Pig analytics job requires a Hadoop file system to function. For use with DSE Hadoop, DataStax Enterprise provides a replacement for the Hadoop Distributed File System (HDFS) called the Cassandra File System (CFS).
Configuring DSE Analytics: Guidelines and steps to configure DSE Analytics.
Analyzing data using Spark: Spark is the default mode when you start an analytics node in a packaged installation. Spark runs locally on each node.
Analyzing data using DSE Hadoop: You can run analytics on Cassandra data using Hadoop that is integrated into DataStax Enterprise. The Hadoop component in DataStax Enterprise enables analytics to be run across the DataStax Enterprise distributed, shared-nothing architecture.
Analyzing data using external Hadoop systems (BYOH): DataStax Enterprise works with external Hadoop systems in a bring your own Hadoop (BYOH) model. Use BYOH when you want to run DSE with a separate Hadoop cluster, from a different vendor.
DSE Search: DataStax Enterprise Search (DSE Search) simplifies using search applications for data that is stored in a Cassandra database. DSE Search is an enterprise-grade search solution that scales across multiple datacenters and the cloud.
About DSE Search: DSE Search (DataStax Enterprise Search) simplifies using search applications for data that is stored in a Cassandra database. DSE Search is an enterprise-grade search solution that scales across multiple datacenters and the cloud.
Starting and stopping DSE Search: The way you start a DSE Search node depends on the type of installation.
DSE Search architecture: An overview of DataStax Enterprise Search architecture.
Queries: DSE Search hooks into the Cassandra Command Line Interface (CLI), Cassandra Query Language (CQL) library, the cqlsh tool, existing Solr APIs, and Thrift APIs.
Working with advanced data types: tuples and UDTs: Guidelines and steps for using DSE Search with advanced data types, including tuples and user-defined types (UDT).
Schema and data modeling: Topics on how the Solr schema defines the relationship between data in a table and a Solr core.
Configuring DSE Search: DSE Search configuration tasks.
Operations: You can run DSE Search on one or more nodes. Typical operations include configuring nodes, policies, query routing, load balancing, and communications.
Performance tuning: Tuning DSE Search in the event of performance degradation, high memory consumption, or other problems.
Update request processor and field transformer: Use the custom update request processor (URP) to extend the Solr URP. Use the field input/output transformer API as an option to the input/output transformer support in Solr.
Unsupported features for DSE Search: Unsupported Cassandra and Solr features for DSE Search.
DSE Search vs. Open source: Differences between DSE Search and Open Source Solr (OSS).
DSE Search tutorials and demos: Use the tutorials and demos to learn how to use DSE Search.
Troubleshooting: Take appropriate action to troubleshoot inconsistent query results, trace Solr HTTP requests, and use Mbeans.
DSE Advanced Security: DataStax Enterprise includes advanced data protection for enterprise-grade databases including LDAP authentication support, internal authentication, object permissions, encryption, Kerberos authentication, and data auditing.
About security management: An overview of DataStax Enterprise security.
Authenticating with Kerberos: DataStax Enterprise authentication with Kerberos protocol uses tickets to prove identity for nodes that communicate over non-secure networks.
Authenticating with LDAP: DataStax Enterprise supports LDAP authentication for external LDAP services.
Setting up SSL for nodetool and dsetool: Using nodetool and dsetool with SSL encryption.
Encryption: DataStax Enterprise supports encryption for in-flight data and at-rest data.
Running cqlsh: Sample files are provided to help configure authentication for Kerberos, SSL, and Kerberos and SSL.
Configuring data auditing: Enable logging for the audit logger on the node that is set up for logging. Logs provide detailed audit trails of cluster activity.
Internal authentication: Internal authentication is based on Cassandra-controlled login accounts and passwords.
Managing object permissions: Use GRANT/REVOKE to grant or revoke permissions to access Cassandra data.
Configuring keyspace replication: The system_auth and dse_security keyspaces store security authentication and authorization information.
Configuring firewall ports: If a firewall runs on the nodes in the Cassandra or DataStax Enterprise cluster, open up ports to allow communication between the nodes.
Making /tmp non-executable: Increase security by mounting /tmp as non-executable.
DSE Management Services: DSE Management Services automatically handle administration and maintenance tasks and assist with overall database cluster management.
Performance Service: The DataStax Enterprise Performance Service automatically collects and organizes performance diagnostic information into a set of data dictionary tables that can be queried with CQL.
Capacity Service: The Capacity Service automatically collects data about a cluster's operations, including Cassandra-specific and platform-specific data (for example, disk metrics and network metrics), at both the node and column-family level (where applicable). Use OpsCenter to manage the service and perform trend analysis.
Repair Service: The Repair Service is designed to automatically keep data synchronized across a cluster. You can manage the Repair Service with OpsCenter or by using the command line.
DSE In-Memory: DataStax Enterprise includes DSE In-Memory for storing data to and accessing data exclusively from memory.
Creating or altering tables to use DSE In-Memory: Use CQL directives to create and alter tables to use DSE In-Memory.
Verifying table properties: In cqlsh, use the DESCRIBE command to view table properties.
Managing memory: You must monitor and carefully manage available memory when using DSE In-Memory.
Backing up and restoring data: The procedure for backing up and restoring data is the same for DSE In-Memory data and on-disk data.
Deploying: Production deployment of DataStax Enterprise includes planning, configuration, and choosing how the data is divided across the nodes in the cluster.
Production deployment planning: Resources for deployment planning and recommendations for deployment.
Configuring replication: How to set up DataStax Enterprise to store multiple copies of data on multiple nodes for reliability and fault tolerance.
Mixing workloads: Organize nodes that run different workloads into virtual datacenters. Put analytic nodes in one datacenter, search nodes in another, and Cassandra real-time transactional nodes in another datacenter.
Single datacenter deployment per workload type: Steps for configuring nodes in a deployment scenario in a mixed workload cluster that has only one datacenter for each type of workload.
Multiple datacenter deployment per workload type: Steps for configuring nodes in a deployment scenario in a mixed workload cluster that has more than one datacenter for each type of node.
Single-token architecture deployment: Steps for deploying when you are not using virtual nodes (vnodes).
Calculating tokens for single-token architecture nodes: When not using vnodes, use these steps to calculate tokens to evenly distribute data across a cluster.
Expanding an AMI cluster: To expand your EC2 implementations, use OpsCenter.
Migrating data: Migrate data using Sqoop or other methods.
Migrating data using Sqoop: For DSE Hadoop, use Sqoop to transfer data between an RDBMS data source and Hadoop or between other data sources, such as NoSQL.
Migrating data using other methods: Other methods for migrating data to DataStax Enterprise include the COPY command, the DSE Search/Solr Data Import Handler, and the Cassandra bulk loader.
Bulk saving data from Spark RDD to Cassandra: Bulk saving data from Spark RDD to Cassandra bypasses the standard Cassandra write-path.
Tools: Tools include dse commands, dsetool, cfs-stress tool, pre-flight check and yaml_diff tools, and the Cassandra bulk loader.
dse commands: The dse commands provide additional controls for starting and using DataStax Enterprise.
dsetool utility: Use the dsetool utility for creating system keys, encrypting sensitive configuration information, and performing Cassandra File System (CFS) and Hadoop-related tasks, such as checking the CFS, and listing node subranges of data in a keyspace.
The cfs-stress tool: The cfs-stress tool performs stress testing of the Cassandra File System (CFS) layer.
Pre-flight check and yaml_diff tools: The pre-flight check tool is available for packaged installations. It is a collection of tests that can be run on a node to detect and fix configuration problems. The yaml_diff tool filters differences between cassandra.yaml files.
Troubleshooting: Use these troubleshooting examples to discover and resolve problems with DSE.
Release Notes: DataStax Enterprise release notes cover cluster requirements, upgrade guidance, components, changes and enhancements, issues, and resolved issues for DataStax Enterprise 4.8 releases.
Cassandra changes: DataStax Enterprise 4.8 includes production-certified Cassandra changes.
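
As an illustration of the entry "Calculating tokens for single-token architecture nodes" above, the following is a minimal Python sketch of evenly spaced token calculation, not the documented procedure itself. It assumes the cluster uses the Murmur3Partitioner, whose token range runs from -2^63 to 2^63 - 1, and it assigns one token per node. The function name evenly_spaced_tokens is hypothetical and chosen only for this example.

```python
def evenly_spaced_tokens(node_count):
    """Return one token per node, evenly spaced over the Murmur3 range.

    Assumption: Murmur3Partitioner, token range -2**63 .. 2**63 - 1.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Width of each node's slice of the full 2**64 token range.
    step = 2**64 // node_count
    # Shift by -2**63 so the first token sits at the bottom of the range.
    return [step * i - 2**63 for i in range(node_count)]


if __name__ == "__main__":
    # Print the tokens for a hypothetical four-node datacenter.
    for node_index, token in enumerate(evenly_spaced_tokens(4)):
        print(f"node {node_index}: {token}")
```

Each resulting value would be assigned to one node as its initial_token in cassandra.yaml. A different partitioner uses a different token range, so the arithmetic above would need to change accordingly.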