About data distribution and replication

An overview of data distribution and replication.

In DataStax Enterprise, data distribution and replication go together. Data is organized by table and identified by a primary key, which determines the node on which the data is stored. Replicas are copies of rows. The copy created when data is first written is also referred to as a replica.

Features affecting replication include:

  • Virtual nodes assign data ownership to physical machines.
  • Partitioners distribute the data across the cluster.
  • The replication strategy determines the replicas for each row of data.
  • Snitches define the topology information that the replication strategy uses to place replicas.
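
To illustrate how these features fit together, the following is a minimal Python sketch, not the DataStax Enterprise implementation. It assumes a toy partitioner built on MD5 (chosen only because it is in the standard library), a fixed number of virtual nodes per machine, and a simple placement rule that walks the token ring clockwise until it has found enough distinct machines. The names token, build_ring, and replicas_for are hypothetical, and the sketch ignores the topology information that a snitch would supply.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Toy partitioner (assumption): hash a string to a 64-bit token."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def build_ring(nodes, vnodes_per_node=4):
    """Give each machine several tokens (virtual nodes); return the sorted ring."""
    return sorted(
        (token(f"{node}-vnode-{i}"), node)
        for node in nodes
        for i in range(vnodes_per_node)
    )


def replicas_for(key, ring, replication_factor=3):
    """Walk the ring clockwise from the key's token, collecting distinct machines."""
    tokens = [t for t, _ in ring]
    # The first virtual node whose token is >= the key's token owns the key.
    start = bisect.bisect_left(tokens, token(key))
    owners = []
    for offset in range(len(ring)):
        node = ring[(start + offset) % len(ring)][1]
        if node not in owners:
            owners.append(node)
        if len(owners) == replication_factor:
            break
    return owners


# Hypothetical four-machine cluster and key, for illustration only.
ring = build_ring(["node1", "node2", "node3", "node4"])
print(replicas_for("user:42", ring))
```

The same key always maps to the same list of machines, and the list never names a machine twice. A topology-aware strategy would additionally use snitch information, for example to spread replicas across racks or datacenters, which this sketch does not attempt.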