Setting the replication factor for analytics keyspaces
Guidelines and steps to set the replication factor for keyspaces on DSE Analytics nodes.
Keyspaces and tables are automatically created when DSE Analytics nodes are started for the first time. The replication factor must be adjusted for these keyspaces in order for the analytics features to work properly and to avoid data loss.
The keyspaces used by DSE Analytics are the following:
dse_analytics
dse_leases
dsefs
"HiveMetaStore"
All analytics keyspaces are initially created with the SimpleStrategy replication strategy and a replication factor (RF) of 1. Each of these must be updated in production environments to avoid data loss. After starting the cluster, alter each keyspace to use the NetworkTopologyStrategy replication strategy with appropriate settings for the replication factor and datacenters. For most environments using DSE Analytics, a suitable replication factor is either 3 or the cluster size, whichever is smaller.
For example, use a CQL statement to configure the dse_leases keyspace for a replication factor of 3 in both the DC1 and DC2 datacenters using NetworkTopologyStrategy:
ALTER KEYSPACE dse_leases WITH REPLICATION = { 'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3' };
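Because the same change applies to every analytics keyspace, the statements can be generated rather than typed by hand. The following is a minimal sketch, not an official DSE tool: the function names pick_rf and alter_statements and the DC:NODES argument format are hypothetical, and it assumes the "3 or the cluster size, whichever is smaller" guideline is applied per datacenter using that datacenter's node count. It only prints CQL and does not connect to the cluster.

```shell
# Sketch: print ALTER KEYSPACE statements for every analytics keyspace.
# Nothing is executed against the cluster; the output is plain CQL text.

# RF guideline from the text: 3 or the node count, whichever is smaller.
# Assumption: the guideline is applied to each datacenter separately.
pick_rf() {
  local nodes=$1
  if [ "$nodes" -lt 3 ]; then echo "$nodes"; else echo 3; fi
}

# Usage: alter_statements DC:NODES [DC:NODES ...]
# Datacenter names are case-sensitive and are passed through unchanged.
alter_statements() {
  local map="" pair dc nodes ks
  for pair in "$@"; do
    dc=${pair%%:*}      # text before the first colon: datacenter name
    nodes=${pair##*:}   # text after the last colon: node count
    map="$map, '$dc': '$(pick_rf "$nodes")'"
  done
  # "HiveMetaStore" is mixed case, so it keeps its double quotes in CQL.
  for ks in dse_analytics dse_leases dsefs '"HiveMetaStore"'; do
    printf "ALTER KEYSPACE %s WITH REPLICATION = { 'class': 'NetworkTopologyStrategy'%s };\n" "$ks" "$map"
  done
}
```

For instance, alter_statements DC1:3 DC2:3 prints four statements, one per keyspace, each in the same form as the dse_leases example. The output can be reviewed, saved to a file, and then run with cqlsh -f; connection options are omitted here because they depend on the environment.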
The dsefs keyspace only contains metadata, not the data stored in DSEFS. Each DSE Analytics datacenter should have its own DSEFS instance.
The datacenter name used is case-sensitive. If needed, use the dsetool status command to confirm the exact datacenter spelling.
After adjusting the replication factor, nodetool repair must be run on each node in the affected datacenters. For example, to repair the altered keyspace dse_leases:
nodetool repair -full dse_leases
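The repair step can be sketched the same way for all four keyspaces. The function name repair_commands is hypothetical, and the sketch assumes nodetool accepts the mixed-case HiveMetaStore keyspace name without CQL-style quotes. It only prints the commands, so they can be checked before being run on each node.

```shell
# Sketch: print the full-repair command for each analytics keyspace.
# Assumption: nodetool takes the keyspace name as-is, without CQL quoting.
repair_commands() {
  local ks
  for ks in dse_analytics dse_leases dsefs HiveMetaStore; do
    printf 'nodetool repair -full %s\n' "$ks"
  done
}
```

Running repair_commands prints one nodetool repair -full line per keyspace. Piping the output to sh on a node would run the repairs one after another on that node.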
Repeat the above steps for each of the analytics keyspaces listed above. For more information, see Changing keyspace replication strategy.