About DataStax Enterprise: DataStax Enterprise is a big data platform built on Apache Cassandra that manages real-time, analytics, and enterprise search data. DataStax Enterprise leverages Cassandra, Apache Hadoop, and Apache Solr to shift your focus from the data infrastructure to using your data strategically.
Using the in-memory option: DataStax Enterprise includes the in-memory option for storing data to and accessing data from memory exclusively.
Upgrading: See the DataStax Upgrade Guide.
Compression: Configure data compression on a per-table basis to optimize performance of read-dominated tasks.
Installing: DataStax Enterprise installation methods include GUI or text mode, unattended command line or properties file, Yum and APT repositories, and binary tarball.
Installing on RHEL-based systems: Install DataStax Enterprise and OpsCenter using Yum repositories on RHEL-based systems.
Installing on Debian-based systems: Install DataStax Enterprise and OpsCenter using APT repositories on Debian-based systems.
Installing the binary tarball: Install DataStax Enterprise on any Linux-based platform, including 32-bit platforms.
Installing on SUSE: DataStax provides a binary tarball distribution for installing DataStax Enterprise on SUSE Linux.
On cloud providers: Install on Amazon EC2 or HP Cloud.
Installing prior releases: Steps for installing the same version as other nodes in your cluster.
Security: Managing security in DataStax Enterprise, including authentication, encryption, auditing, permissions, and configuration.
Security management: DataStax Enterprise includes advanced data protection for enterprise-grade databases, including internal authentication, object permissions, encryption, Kerberos authentication, and data auditing.
Authenticating with Kerberos: DataStax Enterprise authentication with the Kerberos protocol uses tickets to prove identity for nodes that communicate over non-secure networks.
Client-to-node encryption: Client-to-node encryption protects data in flight from client machines to a database cluster.
Node-to-node encryption: Node-to-node encryption protects data that is transferred between nodes in a cluster using SSL (Secure Sockets Layer).
Server certificates: Generate SSL certificates for client-to-node encryption or node-to-node encryption.
Installing cqlsh security: Install packages to use cqlsh with a Kerberized cluster.
Transparent data encryption: Transparent data encryption (TDE) protects data at rest. TDE requires a secure local file system to be effective.
Data auditing: Auditing is implemented as a log4j-based integration.
Internal authentication: Internal authentication is based on Cassandra-controlled login accounts and passwords.
Managing object permissions: Use GRANT/REVOKE to grant or revoke permissions to access Cassandra data.
Configuring keyspace replication: The system_auth and dse_security keyspaces store security authentication and authorization information.
Configuring firewall ports: Opening the required ports to allow communication between the nodes.
Getting started: The Hadoop component in DataStax Enterprise enables analytics to be run across DataStax Enterprise's distributed, shared-nothing architecture. Instead of using the Hadoop Distributed File System (HDFS), DataStax Enterprise uses Cassandra File System (CFS) keyspaces for the underlying storage layer.
Using the job tracker node: DataStax Enterprise schedules a series of tasks on the analytics nodes for each MapReduce job that is submitted to the job tracker.
About the Cassandra File System: A Hive or Pig analytics job requires a Hadoop file system to function. For use with DSE Hadoop, DataStax Enterprise provides a replacement for the Hadoop Distributed File System (HDFS) called the Cassandra File System (CFS).
Using the cfs-archive to store huge files: The Cassandra File System (CFS) consists of two layers: cfs and cfs-archive. Using cfs-archive is recommended for long-term storage of huge files.
Using Hive: DataStax Enterprise includes a Cassandra-enabled Hive MapReduce client.
ODBC driver for Hive: The DataStax ODBC Driver for Hive provides Windows users access to the information that is stored in DSE Hadoop.
Using Mahout: DataStax Enterprise integrates Apache Mahout, a Hadoop component that offers machine learning libraries.
Using Pig: DataStax Enterprise includes a Cassandra File System (CFS) enabled Apache Pig client to provide a high-level programming environment for MapReduce coding.
Sqoop: Topics about migrating data using Sqoop.
Getting Started with Solr: DataStax Enterprise supports Open Source Solr (OSS) tools and APIs, simplifying migration from Solr to DataStax Enterprise.
Supported and unsupported features: Supported and unsupported DSE Search and Solr features.
Defining key Solr terms: Solr terms include several names for an index of documents and configuration on a single node.
Installing Solr nodes: Installing and starting Solr nodes.
Solr tutorial: Steps for setting up Cassandra and Solr for the tutorial.
Configuring Solr: Configure Solr type mapping.
Creating an index for searching: Requirements and steps for creating a Solr index.
Using DSE Search/Solr: A brief description and illustration of DSE Search.
Querying Solr data: A brief description of querying Solr data.
Capacity planning: Use a discovery process to develop a plan to ensure sufficient memory resources.
Mixing workloads: About using real-time (Cassandra), Hadoop, or search (Solr) nodes in the same cluster.
Common operations: Topics for using DSE Search.
Tuning DSE Search performance: Topics for performance tuning and solving performance degradation, high memory consumption, or other problems with DataStax Enterprise Search nodes.
DSE vs. Open source: A comparison of DSE Search and Open Source Solr.
Request processing and data transformation: Use the custom update request processor (URP) to extend the Solr URP. Use the field input/output transformer API as an alternative to the input/output transformer support in Open Source Solr.
Deploying: Deployment topics.
Production deployment planning: Production deployment planning requires knowledge of the initial volume of data to store and an estimate of the typical application workload.
Configuring replication: Choose a data partitioner and replica placement strategy.
Single data center deployment: A deployment scenario with a mixed workload cluster has only one data center for each type of workload.
Multiple data center deployment: A deployment scenario with a mixed workload cluster has more than one data center for each type of node.
Single-token architecture deployment: Use single-token architecture deployment when you are not using virtual nodes (vnodes).
Calculating tokens: Tokens assign a range of data to a particular node within a data center.
Expanding an AMI cluster: To expand your EC2 implementations, use OpsCenter to provision a new cluster, add a new cluster, or add nodes to a cluster.
DataStax Enterprise tools: Tools include dse commands, dsetool, dfs-stress tool, pre-flight check, yaml_diff, and the Cassandra bulk loader.
The dse commands: Table of dse commands for using DataStax Enterprise.
The dsetool: Use the dsetool utility for Cassandra File System (CFS) and Hadoop-related tasks, such as managing the job tracker, checking the CFS, and listing node subranges of data in a keyspace.
Configuring the disk health checker: Ways to enable, disable, and use the disk health checker.
Pre-flight check and yaml_diff tools: The pre-flight check tool is available for packaged installations. This collection of tests can be run on a node to detect and fix configuration problems. The yaml_diff tool filters differences between cassandra.yaml files.
Moving data to/from other databases: DataStax offers several solutions for migrating from other databases.
Reference: Reference topics.
Installing glibc on Oracle Linux: To install DSE on Oracle Enterprise Linux 6.x and later, install the 32-bit versions of the glibc libraries.
Tarball file locations: File locations when DataStax Enterprise is installed from a tarball.
Package file locations: File locations when DataStax Enterprise is installed from a package.
Configuration (dse.yaml): The configuration file for Kerberos authentication, purging of expired data from the Solr indexes, and setting Solr inter-node communication.
Starting and stopping DSE: Starting and stopping DataStax Enterprise as a service or stand-alone process.
Troubleshooting: Troubleshooting examples help you discover and resolve problems with DSE. Also check the Cassandra troubleshooting documentation.
Cassandra Log4j appender: DataStax Enterprise allows you to stream your web and application log information into a database cluster via Apache log4j.
Release notes: Release notes for DataStax Enterprise 4.0.x.
Using the docs: Describes navigation icons and provides search tips and links to other resources.
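
As a rough illustration of the "Calculating tokens" topic listed under Deploying, the sketch below spaces initial tokens evenly for the nodes of one data center in a single-token (non-vnode) deployment. It assumes the cluster uses the Murmur3Partitioner, whose token range runs from -2^63 to 2^63 - 1; the function name single_token_assignments is hypothetical and not part of any DataStax tool. Other partitioners use a different token range, so treat this as a sketch rather than the documented procedure.

```python
def single_token_assignments(node_count):
    """Return evenly spaced tokens for node_count nodes in one data center.

    Assumption: Murmur3Partitioner, token range [-2**63, 2**63 - 1].
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Divide the 2**64-wide token ring into equal steps, one per node.
    step = 2**64 // node_count
    # Shift each position so the first token sits at the bottom of the range.
    return [i * step - 2**63 for i in range(node_count)]


if __name__ == "__main__":
    # Print one candidate token per node for a four-node data center.
    for node_index, token in enumerate(single_token_assignments(4)):
        print(node_index, token)
```

Each printed value is a candidate for one node's initial_token setting in cassandra.yaml. In a multiple data center deployment the calculation is usually done per data center, and the resulting tokens must not collide across data centers.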