Data distribution and replication
In Hyper-Converged Database (HCD), data distribution and replication go together. HCD organizes data by table and identifies each row by its primary key, which also helps determine the node that stores the row. To ensure reliability and fault tolerance, HCD stores multiple copies of each row, called replicas, on different nodes. The first copy written counts as a replica too: all replicas have equal importance, and there is no primary replica.
This topic introduces how HCD distributes data to the nodes in a cluster.
The following features affect replication:
- The replication strategy determines which nodes store replicas for each row of data.
- Consistent hashing maps data to nodes so that adding or removing a node moves as little data as possible.
- Virtual nodes assign data ownership to physical machines.
- Partitioners hash each row's partition key into a token that determines where the row is placed in the cluster.
- Snitches provide topology information that the replication strategy uses to place replicas.
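To make the interaction between these features more concrete, the following is a minimal Python sketch of a token ring. It is not HCD's implementation: `ToyRing`, `token_for`, the MD5-based hash, and the node names are all hypothetical and chosen for illustration only. The sketch assumes a simple placement rule (walk the ring clockwise and pick the next distinct nodes) and ignores the topology information a snitch would supply.

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    """Toy partitioner: hash a key to a 64-bit token.

    MD5 is an assumption made for this sketch, not the hash HCD uses.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ToyRing:
    """Hypothetical consistent-hashing ring with virtual nodes."""

    def __init__(self, nodes, vnodes_per_node=8):
        self.nodes = list(nodes)
        # Virtual nodes: each physical node owns several tokens on the ring.
        self.ring = sorted(
            (token_for(f"{node}#{i}"), node)
            for node in self.nodes
            for i in range(vnodes_per_node)
        )
        self.tokens = [token for token, _ in self.ring]

    def replicas_for(self, key, replication_factor=3):
        """Toy replication strategy: walk clockwise from the key's token
        and collect the first `replication_factor` distinct nodes."""
        wanted = min(replication_factor, len(self.nodes))
        start = bisect.bisect_left(self.tokens, token_for(key))
        replicas = []
        for offset in range(len(self.ring)):
            # The modulo wraps the walk around the end of the ring.
            node = self.ring[(start + offset) % len(self.ring)][1]
            if node not in replicas:
                replicas.append(node)
                if len(replicas) == wanted:
                    break
        return replicas


ring = ToyRing(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas_for("user:42"))
```

Because the existing tokens do not change when a node joins, a key in this sketch either keeps its first replica or moves to the new node. That is the property that consistent hashing is meant to provide.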