About Apache Cassandra: Documentation for developers and administrators on installing, configuring, and using the features and capabilities of Apache Cassandra, a scalable open source NoSQL database.
What's new in Cassandra: An overview of new features in Cassandra.
CQL: Cassandra Query Language (CQL) is the default and primary interface into the Cassandra DBMS.
Understanding the architecture: Important topics for understanding Cassandra.
Architecture in brief: Essential information for understanding and using Cassandra.
Internode communications (gossip): Cassandra uses a protocol called gossip to discover location and state information about the other nodes participating in a Cassandra cluster.
Data distribution and replication: How data is distributed and factors influencing replication.
Partitioners: A partitioner determines how data is distributed across the nodes in the cluster (including replicas).
Snitches: A snitch determines which data centers and racks nodes belong to.
Client requests: Client read or write requests can be sent to any node in the cluster because all nodes in Cassandra are peers.
Planning a cluster deployment: Vital information about successfully deploying a Cassandra cluster.
Installing: Various installation methods.
Installing the RHEL-based packages: Install using Yum repositories on RHEL, CentOS, and Oracle Linux.
Installing the Debian and Ubuntu packages: Install using APT repositories on Debian and Ubuntu.
Installing from the binary tarball: Install on all Linux-based platforms using a binary tarball.
Installing prior releases of DataStax Community: Steps for installing the same version as other nodes in your cluster.
Uninstalling DataStax Community: Steps for uninstalling Cassandra by install type.
Installing on cloud providers: Installation methods for the supported cloud providers.
Recommended production settings: Recommendations for production environments.
Initializing a cluster: Topics for deploying a cluster.
Initializing a multiple node cluster (single data center): A deployment scenario for a Cassandra cluster with a single data center.
Initializing a multiple node cluster (multiple data centers): A deployment scenario for a Cassandra cluster with multiple data centers.
Security: Topics for securing Cassandra.
Securing Cassandra: Cassandra provides these security features to the open source community.
SSL encryption: Topics for using SSL in Cassandra.
Internal authentication: Topics for internal authentication.
Internal authorization: Topics about internal authorization.
Configuring firewall port access: Which ports to open when nodes are protected by a firewall.
Enabling JMX authentication: The default settings for Cassandra make JMX accessible only from localhost. To enable remote JMX connections, change the LOCAL_JMX setting in cassandra-env.sh.
Database internals: Topics about the Cassandra database.
Managing data: An overview of Cassandra's storage structure.
Cassandra storage basics: Understanding how Cassandra stores data.
The write path of an update: A brief description of the write path of an update.
About deletes: How Cassandra deletes data and why deleted data can reappear.
About hinted handoff writes: How hinted handoff works and how it optimizes the cluster.
About reads: How Cassandra combines results from the active memtable and potentially multiple SSTables to satisfy a read.
About transactions and concurrency control: A brief description of transactions and concurrency control.
About data consistency: How up-to-date and synchronized a row of data is on all replicas.
Node and cluster configuration: The cassandra.yaml file is the main configuration file for Cassandra.
Configuring gossip settings: Using the cassandra.yaml file to configure gossip.
Configuring the heap dump directory: Analyzing the heap dump file can help troubleshoot memory problems.
Generating tokens: If not using virtual nodes (vnodes), you still need to calculate tokens for your cluster.
Configuring virtual nodes: Topics about configuring virtual nodes.
Logging configuration: About Cassandra logging functionality using Simple Logging Facade for Java (SLF4J) with log4j.
Commit log archive configuration: Cassandra provides commit log archiving and point-in-time recovery.
Using multiple network interfaces: Steps for configuring Cassandra for multiple network interfaces or when using different regions in cloud implementations.
Hadoop support: Cassandra support for integrating Hadoop with Cassandra.
Tuning Bloom filters: Cassandra uses Bloom filters to determine whether an SSTable has data for a particular row.
Data caching: Data caching topics.
Configuring memtable throughput: Configuring memtable throughput to improve write performance.
Configuring compaction: Steps for configuring compaction. The compaction process merges keys, combines columns, evicts tombstones, consolidates SSTables, and creates a new index in the merged SSTable.
Compression: Compression maximizes the storage capacity of Cassandra nodes by reducing the volume of data on disk and disk I/O, particularly for read-dominated workloads.
Tuning Java resources: Consider tuning Java resources in the event of a performance degradation or high memory consumption.
Purging gossip state on a node: Correcting a problem in the gossip state.
Repairing nodes: Node repair makes data on a replica consistent with data on other nodes.
Adding or removing nodes, data centers, or clusters: Topics for adding or removing nodes, data centers, or clusters.
Backing up and restoring data: Cassandra backs up data by taking a snapshot of all on-disk data files (SSTable files) stored in the data directory.
Taking a snapshot: Steps for taking a global snapshot or per node.
Deleting snapshot files: Steps to delete snapshot files.
Enabling incremental backups: Steps to enable incremental backups. When incremental backups are enabled, Cassandra hard-links each flushed SSTable to a backups directory under the keyspace data directory.
Restoring from a Snapshot: Methods for restoring from a snapshot.
Restoring a snapshot into a new cluster: Steps for restoring a snapshot by recovering the cluster into another newly created cluster.
Recovering from a single disk failure using JBOD: Recovering from a single disk failure in a disk array using JBOD.
Cassandra tools: Topics for Cassandra tools.
The nodetool utility: A command line interface for Cassandra for managing a cluster.
Cassandra bulk loader (sstableloader): Provides the ability to bulk load external data into a cluster, load existing SSTables into another cluster with a different number of nodes or replication strategy, and restore snapshots.
The sstablelevelreset utility: The sstablelevelreset utility resets the level to 0 on a given set of SSTables.
The cassandra utility: Cassandra start-up parameters can be run from the command line (in Tarball installations) or specified in the cassandra-env.sh file (Package or Tarball installations).
The cassandra-stress tool: A Java-based stress testing utility for benchmarking and load testing a Cassandra cluster.
The sstablescrub utility: An offline version of nodetool scrub. This tool attempts to remove the corrupted parts while preserving non-corrupted data.
The sstablesplit utility: Use this tool to split SSTable files into multiple SSTables of a maximum designated size.
sstablekeys: The sstablekeys utility dumps table keys.
The sstableupgrade tool: Upgrade the SSTables in the specified table or snapshot to match the currently installed version of Cassandra.
Starting and stopping Cassandra: Topics for starting and stopping Cassandra.
Install locations: Install location topics.
Cassandra-CLI utility (deprecated): Cassandra stores storage configuration attributes in the system keyspace.
Moving data to/from other databases: Solutions for migrating from other databases.
Reads are getting slower while writes are still fast: The cluster's IO capacity is not enough to handle the write load it is receiving.
Nodes seem to freeze after some period of time: Some portion of the JVM is being swapped out by the operating system (OS).
Nodes are dying with OOM errors: Nodes are dying with OutOfMemory exceptions.
Nodetool or JMX connections failing on remote nodes: Nodetool commands can be run locally but not on other nodes in the cluster.
View of ring differs between some nodes: Indicates that the ring is in a bad state.
Insufficient user resource limits errors: Insufficient resource limits may result in a number of errors in Cassandra and OpsCenter.
Cannot initialize class org.xerial.snappy.Snappy: An error may occur when Snappy compression/decompression is enabled although its library is available from the classpath.
Release notes: Release notes for DataStax Community.