Creating and updating a keyspace

Creating a keyspace is the CQL counterpart to creating an SQL database, with some differences. A Cassandra keyspace is a namespace that defines how data is replicated across nodes. Typically, a cluster has one keyspace per application. Replication is controlled on a per-keyspace basis, so data with different replication requirements typically resides in different keyspaces. Keyspaces are not designed to serve as a significant mapping layer within the data model; they are designed to control data replication for a set of tables.

When you create a keyspace, specify a replication strategy class for it. The SimpleStrategy class is fine for evaluating Apache Cassandra. For production use or for use with mixed workloads, use the NetworkTopologyStrategy class.
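
As a minimal sketch of the evaluation case, the following statement creates a keyspace that uses SimpleStrategy. The keyspace name demo_eval is hypothetical, and the replication factor of 1 assumes a single-node evaluation cluster.

```cql
-- Hypothetical keyspace name; a replication_factor of 1 assumes a single-node cluster.
CREATE KEYSPACE IF NOT EXISTS demo_eval
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
```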

To use NetworkTopologyStrategy for evaluation, for example on a single-node cluster, specify the default data center name. To determine the default data center name, use the nodetool status command. On Linux, for example, run the following from the installation directory:

$ bin/nodetool status

The output is:

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID      Rack
UN  127.0.0.1  41.62 KB   256     100.0%            75dcca8f...  rack1

To use NetworkTopologyStrategy in production, change the default snitch, SimpleSnitch, to a network-aware snitch, define one or more data center names in the snitch properties file, and use those data center names when you define the keyspace. Otherwise, Cassandra fails to complete any write request, such as inserting data into a table, and logs this error message:

Unable to complete request: one or more nodes were unavailable.

You cannot insert data into a table in a keyspace that uses NetworkTopologyStrategy unless you define the data center names in the snitch properties file or you use a single data center named datacenter1.
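
The following sketch shows how a data center name might be supplied when creating a keyspace with NetworkTopologyStrategy. The keyspace names demo_nts and demo_prod, the data center names dc_east and dc_west, and the replication factors are all hypothetical. Only datacenter1 comes from the nodetool status output shown earlier.

```cql
-- Single data center: the name must match the one reported by nodetool status.
CREATE KEYSPACE IF NOT EXISTS demo_nts
  WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 1};

-- Hypothetical multi-data-center variant: dc_east and dc_west stand in for
-- whatever names are defined in your snitch properties file.
CREATE KEYSPACE IF NOT EXISTS demo_prod
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc_east': 3, 'dc_west': 2};
```

Each entry after the class name maps a data center name to the number of replicas kept in that data center, so the names have to match what the snitch reports for writes to succeed.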