Create a table

To create a table, you must define the table name, column names, column data types, and a primary key. The cqlsh command-line tool is the primary tool for interacting with CQL to define schema.

Add the optional WITH clause and keyword arguments to configure table properties (caching, compaction, etc.) as described in the table options.

Dynamic schema generation and schema collision

It is important to note that dynamic schema generation is not supported. Schema collisions can occur if multiple clients attempt to generate tables simultaneously. Contact Support if you believe that a schema collision is occurring for resolution.

Prerequisites

Keyspace and table name conventions:

The name of a keyspace or a table is a string of alphanumeric characters and underscores, but it must begin with a letter. If case must be maintained, the name must be encased in double quotes, such as "MyTable".

Since tables are defined within a keyspace, you can either use the keyspace as part of the table creation command, or create a table in the current keyspace. To specify the keyspace as part of a table name, use the keyspace name, a period (.), and table name, such as cycling.cyclist-stats.

Simple table - single partition key in current keyspace

To create a table in the current keyspace, you can specify just the new table name:

CREATE TABLE cycling.cyclist_name (
  id UUID PRIMARY KEY, 
  lastname text, 
  firstname text
);
Results
CREATE TABLE cycling.cyclist_name (
    id uuid PRIMARY KEY,
    firstname text,
    lastname text
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

In this example, first the cycling keyspace is selected, and then a table is created with three columns, the first of which is defined as a single PRIMARY KEY. That column, id, is a single partition key.

Simple table - single partition key in another keyspace

To create a table in another keyspace, you can specify both the keyspace and new table name:

CREATE TABLE cycling.cyclist_name (
  id UUID PRIMARY KEY, 
  lastname text, 
  firstname text
);
Results
CREATE TABLE cycling.cyclist_name (
    id uuid PRIMARY KEY,
    firstname text,
    lastname text
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

The difference between the last example and this one is just that the keyspace name is used in the table creation.

Table with primary key separately specified

To define the PRIMARY KEY with one or more columns, add an additional item to the table creation:

CREATE TABLE cycling.cyclist_name (
  id UUID, 
  lastname text, 
  firstname text, 
  PRIMARY KEY (id)
);
Results
CREATE TABLE cycling.cyclist_name (
    id uuid PRIMARY KEY,
    firstname text,
    lastname text
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

The difference between the last example and this one is just that the PRIMARY KEY is specified separately.

Table with name of mixed case

To name a table with a mixed case, use double quotes:

CREATE TABLE cycling."addThis" (
  a int PRIMARY KEY
);
Results
CREATE TABLE cycling.addthis (
    a int PRIMARY KEY
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

Note the double quotes ("…​") around the table name, to ensure the table is created with the mixed case name.

Unless you need keyspaces or tables with uppercase or mixed case names, simplify by using lowercase only. If your keyspace or table creation commands use uppercase without double quotes, the name will be created with lowercase only.

Table - multi-column (composite) partition key

To define the PRIMARY KEY with a partition key of one or more columns:

CREATE TABLE IF NOT EXISTS cycling.rank_by_year_and_name (
  race_year int,
  race_name text,
  cyclist_name text,
  rank int,
  PRIMARY KEY ((race_year, race_name), rank)
);
Results
CREATE TABLE cycling.rank_by_year_and_name (
    race_year int,
    race_name text,
    rank int,
    cyclist_name text,
    PRIMARY KEY ((race_year, race_name), rank)
) WITH CLUSTERING ORDER BY (rank ASC)
    AND additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

In this example, race_year and race_name are the two columns that comprise the composite partition key. They are encased in parentheses, inside the PRIMARY KEY definition. The column rank is an additional column that is a clustering column, as the next section describes.

Table - clustering columns

The last example showed one example of using a clustering column. Clustering columns in a primary key are all columns listed that are not part of the partition key. The partition key, as shown above, is either the first column defined in the primary key, or a composite partition key encased in parentheses.

Here is another example to make clear the parts of a primary key:

CREATE TABLE IF NOT EXISTS cycling.cyclist_category (
  category text,
  points int,
  id UUID,
  lastname text,
  PRIMARY KEY (category, points)
)
WITH CLUSTERING ORDER BY (points DESC);
Results
CREATE TABLE cycling.cyclist_category (
    category text,
    points int,
    id uuid,
    lastname text,
    PRIMARY KEY (category, points)
) WITH CLUSTERING ORDER BY (points DESC)
    AND additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

In this example, The category column is the partition key, and the points column is a clustering column. It is important to remember that the primary key must uniquely identify a row within the cyclist_category table. More than one row with the same category can exist, as long as that row contains a different points value. All the rows with the same category will sort by the points values in descending order due to the addition of the table option WITH CLUSTER ORDER BY. Normally, columns are sorted in ascending alphabetical order.

Table with a frozen UDT

Create the race_winners table that has a frozen user-defined type (UDT):

CREATE TABLE IF NOT EXISTS cycling.race_winners (
  cyclist_name FROZEN<fullname>, 
  race_name text,
  race_position int,
  PRIMARY KEY (race_name, race_position)
);
Results
CREATE TABLE cycling.race_winners (
    race_name text,
    race_position int,
    cyclist_name frozen<fullname>,
    PRIMARY KEY (race_name, race_position)
) WITH CLUSTERING ORDER BY (race_position ASC)
    AND additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

See Creating a user-defined type for information on creating UDTs. UDTs can be created unfrozen if only non-collection fields are used in the user-defined type creation. If the table is created with an unfrozen UDT, then individual field values can be updated and deleted.

Table with a geospatial type

CREATE TABLE cycling.geospatial (
  id text PRIMARY KEY,
  point 'PointType',
  linestring 'LineStringType'
);
Results
CREATE TABLE cycling.geospatial (
    id text PRIMARY KEY,
    linestring 'org.apache.cassandra.db.marshal.LineStringType',
    point 'org.apache.cassandra.db.marshal.PointType'
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

Table with a CDC log

Create a change data capture log for the cyclist_id table:

CREATE TABLE IF NOT EXISTS cycling.cyclist_id (
  lastname text,
  firstname text,
  age int,
  id UUID,
  PRIMARY KEY ((lastname, firstname), age)
);
Results
CREATE TABLE cycling.cyclist_id (
    lastname text,
    firstname text,
    age int,
    id uuid,
    PRIMARY KEY ((lastname, firstname), age)
) WITH CLUSTERING ORDER BY (age ASC)
    AND additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

CDC logging must be enabled in cassandra.yaml.

Before enabling CDC logging, have a plan for moving and consuming the log information. After the disk space limit is reached, writes to CDC-enabled tables are rejected until more space is freed. See Change-data-capture (CDC) space settings for information about available CDC settings.

Storing data in descending order

The following example shows a table definition that stores the categories with the highest points first.

CREATE TABLE IF NOT EXISTS cycling.cyclist_category (
  category text,
  points int,
  id UUID,
  lastname text,
  PRIMARY KEY (category, points)
)
WITH CLUSTERING ORDER BY (points DESC);

Table ID for commit log replay to restore table data

Recreate a table with its original ID to facilitate restoring table data by replaying commit logs:

CREATE TABLE IF NOT EXISTS cycling.cyclist_emails (
  userid text PRIMARY KEY,
  id UUID,
  emails set<text>
)
WITH ID = '1bb7516e-b140-11e8-96f8-529269fb1459';
Results
CREATE TABLE cycling.cyclist_emails (
    userid text PRIMARY KEY,
    emails set<text>,
    id uuid
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

To retrieve a table’s ID, query the id column of system_schema.tables. For example:

SELECT id FROM system_schema.tables
WHERE keyspace_name = 'cycling'
  AND table_name = 'cyclist_emails';
Results
 id
--------------------------------------
 1bb7516e-b140-11e8-96f8-529269fb1459

(1 rows)

To perform a point-in-time restoration of the table, see Restoring a backup to a specific point-in-time.


Related information:

Was this helpful?

Give Feedback

How can we improve the documentation?

© 2024 DataStax | Privacy policy | Terms of use

Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Pulsar, Pulsar, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries. Kubernetes is the registered trademark of the Linux Foundation.

General Inquiries: +1 (650) 389-6000, info@datastax.com