CREATE KEYSPACE

Define a new keyspace and its replica placement strategy.

Synopsis 

CREATE ( KEYSPACE | SCHEMA )  keyspace_name 
WITH REPLICATION = map
AND DURABLE_WRITES = ( true | false )

map is a map collection, a JSON-style array of literals:

{ literal : literal, literal : literal ... }

Legend
  • Uppercase means literal
  • Lowercase means not literal
  • Italics mean optional
  • The pipe (|) symbol means OR or AND/OR
  • Ellipsis (...) means repeatable

A semicolon that terminates CQL statements is not included in the synopsis.

Description 

CREATE KEYSPACE creates a top-level namespace and sets the keyspace name, replica placement strategy class, replication factor, and the DURABLE_WRITES option for the keyspace.

When you configure NetworkTopologyStrategy as the replica placement strategy, you set up one or more virtual data centers. Use the same names for data centers as those used by the snitch. You assign different nodes, depending on the type of workload, to separate data centers. For example, assign Hadoop nodes to one data center and Cassandra real-time nodes to another. Segregating workloads ensures that only one type of workload is active per data center. The segregation prevents incompatibility problems between workloads, such as different batch requirements that affect performance.

A map of properties and values defines the two different types of keyspaces:

{ 'class' : 'SimpleStrategy', 'replication_factor' : <integer> };
{ 'class' : 'NetworkTopologyStrategy'[, '<data center>' : <integer>, '<data center>' : <integer>] . . . };
Table of map properties and values
Property              | Value                                         | Value Description
----------------------+-----------------------------------------------+------------------
'class'               | 'SimpleStrategy' or 'NetworkTopologyStrategy' | Required. The name of the replica placement strategy class for the new keyspace.
'replication_factor'  | <number of replicas>                          | Required if class is SimpleStrategy; otherwise, not used. The number of replicas of data to store across the nodes of the cluster.
'<first data center>' | <number of replicas>                          | Required if class is NetworkTopologyStrategy and you provide the name of the first data center. The value is the number of replicas of data in the first data center.
'<next data center>'  | <number of replicas>                          | Required if class is NetworkTopologyStrategy and you provide the name of another data center. The value is the number of replicas of data in that data center.
. . .                 | . . .                                         | More replication factors for optional named data centers.
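The rules in the table can be sketched as a small Python helper that assembles the statement text. This is only an illustration under stated assumptions: build_create_keyspace is a hypothetical name, not part of Cassandra or any driver, and it does no quoting or escaping of the names it is given.

```python
def build_create_keyspace(name, strategy, replication_factor=None,
                          datacenters=None, durable_writes=None):
    """Return a CREATE KEYSPACE statement as a string (hypothetical helper)."""
    if strategy == "SimpleStrategy":
        # 'replication_factor' is required for SimpleStrategy.
        if replication_factor is None:
            raise ValueError("SimpleStrategy requires 'replication_factor'")
        if datacenters:
            raise ValueError("data center names are not used with SimpleStrategy")
        pairs = [("replication_factor", int(replication_factor))]
    elif strategy == "NetworkTopologyStrategy":
        # 'replication_factor' is not used; replica counts are given per data center.
        if replication_factor is not None:
            raise ValueError("'replication_factor' is not used with NetworkTopologyStrategy")
        pairs = [(dc, int(count)) for dc, count in (datacenters or {}).items()]
    else:
        raise ValueError("unknown strategy class: %r" % strategy)

    # Map keys and string values are single-quoted; replica counts are bare integers.
    entries = ["'class' : '%s'" % strategy]
    entries += ["'%s' : %d" % (key, count) for key, count in pairs]
    statement = "CREATE KEYSPACE %s WITH REPLICATION = { %s }" % (name, ", ".join(entries))
    if durable_writes is not None:
        statement += " AND DURABLE_WRITES = %s" % ("true" if durable_writes else "false")
    return statement + ";"


print(build_create_keyspace("Excelsior", "SimpleStrategy", replication_factor=3))
print(build_create_keyspace('"Excalibur"', "NetworkTopologyStrategy",
                            datacenters={"dc1": 3, "dc2": 2}))
```

The helper only produces text; running the statement still requires cqlsh or a client driver.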

CQL property map keys must be lowercase. For example, class and replication_factor are correct. Keyspace names are 32 or fewer alphanumeric characters and underscores, the first of which is an alphabetic character. Keyspace names are case-insensitive. To make a name case-sensitive, enclose it in double quotation marks.
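The naming rules above can be expressed as a short check. The following Python sketch uses a hypothetical helper, stored_keyspace_name, and assumes that a name in double quotation marks follows the same character and length rules as an unquoted one; it is not how Cassandra itself validates names.

```python
import re

# An unquoted keyspace name as described above: a letter first, then letters,
# digits, or underscores, 32 characters at most in total.
_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,31}")


def stored_keyspace_name(name):
    """Return the name as it would be stored, per the rules above (hypothetical helper)."""
    quoted = len(name) >= 2 and name.startswith('"') and name.endswith('"')
    bare = name[1:-1] if quoted else name
    if not _NAME.fullmatch(bare):
        raise ValueError("invalid keyspace name: %r" % name)
    # Unquoted names are case-insensitive, so they fold to lowercase;
    # names in double quotation marks keep their case.
    return bare if quoted else bare.lower()


print(stored_keyspace_name("Excelsior"))
print(stored_keyspace_name('"Excalibur"'))
```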

You can use the alias CREATE SCHEMA instead of CREATE KEYSPACE.

Example of setting the SimpleStrategy class 

Construct the CREATE KEYSPACE statement by first declaring the name of the keyspace, followed by the WITH REPLICATION keywords and the equals symbol. Next, to create a keyspace that is not optimized for multiple data centers, use SimpleStrategy for the class value in the map. Then set the replication_factor property. Separate each property name from its value with a colon and enclose the map in curly brackets. For example:

CREATE KEYSPACE Excelsior
  WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };

Using SimpleStrategy is fine for evaluating Cassandra. For production use or for use with mixed workloads, use NetworkTopologyStrategy.

Example of setting the NetworkTopologyStrategy class 

For production use or for use with mixed workloads, create the keyspace using NetworkTopologyStrategy. NetworkTopologyStrategy works as well as SimpleStrategy for evaluation, must be used with mixed workloads, and is recommended for most other purposes. It also simplifies the transition to multiple data centers if and when future expansion requires it.

Before creating a keyspace for use with multiple data centers, configure the cluster that will use the keyspace. Configure the cluster to use a network-aware snitch, such as the PropertyFileSnitch. Create a keyspace using NetworkTopologyStrategy for the class value in the map. Set one or more key-value pairs, each consisting of a data center name and the number of replicas for that data center, separated by a colon. Enclose the map in curly brackets. For example:

CREATE KEYSPACE "Excalibur"
  WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : 3, 'dc2' : 2};

This example sets three replicas for a data center named dc1 and two replicas for a data center named dc2. The data center names you use depend on the snitch configured for the cluster: each name defined in the map must match a data center name as recognized by the snitch. If you are not sure what the names are, the nodetool status command prints out the data center names and rack locations of your nodes.
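One way to check the names before creating the keyspace is to compare the replication map with the output of nodetool status. The Python sketch below assumes that each data center appears in that output on a header line of the form "Datacenter: <name>". The function names are hypothetical, and SAMPLE is an abbreviated illustration, not captured output.

```python
def datacenter_names(status_output):
    """Collect data center names from 'Datacenter: <name>' header lines."""
    names = []
    for line in status_output.splitlines():
        line = line.strip()
        if line.startswith("Datacenter:"):
            names.append(line.split(":", 1)[1].strip())
    return names


def unknown_datacenters(replication, status_output):
    """Return the data center keys of a replication map that the output does not list."""
    known = set(datacenter_names(status_output))
    return [key for key in replication if key != "class" and key not in known]


# Abbreviated, illustrative sample; real output also lists the nodes in each data center.
SAMPLE = """\
Datacenter: dc1
===============
Datacenter: dc2
===============
"""

replication = {"class": "NetworkTopologyStrategy", "dc1": 3, "dc3": 2}
print(unknown_datacenters(replication, SAMPLE))
```

A non-empty result means that a name in the map does not match any data center reported by the cluster.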

Setting DURABLE_WRITES 

You can set the DURABLE_WRITES option after the map specification of the CREATE KEYSPACE command. When set to false, data written to the keyspace bypasses the commit log. Be careful using this option because you risk losing data. Do not set this attribute on a keyspace using the SimpleStrategy.

CREATE KEYSPACE Risky
  WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy',
  'dc1' : 3 } AND DURABLE_WRITES = false;

Checking created keyspaces 

Check that the keyspaces were created:

SELECT * FROM system.schema_keyspaces;
keyspace_name  | durable_writes | strategy_class                                       | strategy_options
---------------+----------------+------------------------------------------------------+----------------------------
     excelsior |           True |          org.apache.cassandra.locator.SimpleStrategy | {"replication_factor":"3"}
     Excalibur |           True | org.apache.cassandra.locator.NetworkTopologyStrategy |      {"dc2":"2","dc1":"3"}
         risky |          False | org.apache.cassandra.locator.NetworkTopologyStrategy |                {"dc1":"3"}
        system |           True |           org.apache.cassandra.locator.LocalStrategy |                         {}
 system_traces |           True |          org.apache.cassandra.locator.SimpleStrategy | {"replication_factor":"1"}

Cassandra converted the Excelsior and Risky keyspace names to lowercase because quotation marks were not used when those keyspaces were created, and retained the initial capital letter of Excalibur because quotation marks were used.
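The strategy_options column in the output above holds JSON text whose values are strings. As a minimal sketch, a client could decode that text to total the replicas configured for a keyspace. The function name total_replicas is hypothetical.

```python
import json


def total_replicas(strategy_options):
    """Sum the replica counts in a strategy_options value such as '{"dc2":"2","dc1":"3"}'."""
    options = json.loads(strategy_options)
    # Counts are stored as strings, so convert each one before summing.
    return sum(int(count) for count in options.values())


print(total_replicas('{"dc2":"2","dc1":"3"}'))
print(total_replicas('{"replication_factor":"3"}'))
```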