Creating and updating a keyspace

Creating a keyspace is the CQL counterpart to creating an SQL database.

Creating a keyspace is the CQL counterpart to creating an SQL database, but a little different. The Cassandra keyspace is a namespace that defines how data is replicated on nodes. Typically, a cluster has one keyspace per application. Replication is controlled on a per-keyspace basis, so data that has different replication requirements typically resides in different keyspaces. Keyspaces are not designed to be used as a significant map layer within the data model. Keyspaces are designed to control data replication for a set of tables.

When you create a keyspace, specify a strategy class for replicating keyspaces. Using the SimpleStrategy class is fine for evaluating Apache Cassandra™. For production use or for use with mixed workloads, use the NetworkTopologyStrategy class.

To use NetworkTopologyStrategy for evaluation purposes using, for example, a single node cluster, specify the default data center name. To determine the default data center name, use the nodetool status command. On Linux, for example, in the installation directory:

$ bin/nodetool status

The output is:

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID      Rack
UN  127.0.0.1  41.62 KB   256     100.0%            75dcca8f...  rack1

To use NetworkTopologyStrategy for production use, you need to change the default snitch, SimpleSnitch, to a network-aware snitch, define one or more data center names in the snitch properties file, and use the data center name(s) to define the keyspace; otherwise, Cassandra will fail to complete any write request, such as inserting data into a table, and log this error message:

Unable to complete request: one or more nodes were unavailable.

You cannot insert data into a table in keyspace that uses NetworkTopologyStrategy unless you define the data center names in the snitch properties file or you use a single data center named datacenter1.