Create a table

To create a table, you must define the table name, column names, column data types, and a primary key. The cqlsh command-line tool is the primary tool for interacting with CQL to define schema.

Add the optional WITH clause and keyword arguments to configure table properties (caching, compaction, etc.) as described in the table options.

Dynamic schema generation and schema collision

It is important to note that dynamic schema generation is not supported. Schema collisions can occur if multiple clients attempt to generate tables simultaneously.

Contact IBM Support for assistance with possible schema collisions.

Prerequisites

Create a keyspace
Understand the use and importance of the primary key in table schema.
Determine the data types that each column will require.
Start cqlsh.

Keyspace and table name conventions

The name of a keyspace or a table is a string of alphanumeric characters and underscores, but it must begin with a letter. If case must be maintained, the name must be encased in double quotes, such as "MyTable".

Since tables are defined within a keyspace, you can either use the keyspace as part of the table creation command, or create a table in the current keyspace. To specify the keyspace as part of a table name, use the keyspace name, a period (.), and table name, such as cycling.cyclist-stats.

Simple table - single partition key in current keyspace

To create a table in the current keyspace, you can specify just the new table name:

CREATE TABLE cycling.cyclist_name (
  id UUID PRIMARY KEY,
  lastname text,
  firstname text
);

Result

CREATE TABLE cycling.cyclist_name (
    id uuid PRIMARY KEY,
    firstname text,
    lastname text
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

In this example, first the cycling keyspace is selected, and then a table is created with three columns, the first of which is defined as a single PRIMARY KEY. That column, id, is a single partition key.

Simple table - single partition key in another keyspace

To create a table in another keyspace, you can specify both the keyspace and new table name:

CREATE TABLE cycling.cyclist_name (
  id UUID PRIMARY KEY,
  lastname text,
  firstname text
);

Result

CREATE TABLE cycling.cyclist_name (
    id uuid PRIMARY KEY,
    firstname text,
    lastname text
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

The difference between the last example and this one is just that the keyspace name is used in the table creation.

Table with primary key separately specified

To define the PRIMARY KEY with one or more columns, add an additional item to the table creation:

CREATE TABLE cycling.cyclist_name (
  id UUID,
  lastname text,
  firstname text,
  PRIMARY KEY (id)
);

Result

CREATE TABLE cycling.cyclist_name (
    id uuid PRIMARY KEY,
    firstname text,
    lastname text
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

The difference between the last example and this one is just that the PRIMARY KEY is specified separately.

Table with name of mixed case

To name a table with a mixed case, use double quotes:

CREATE TABLE cycling."addThis" (
  a int PRIMARY KEY
);

Result

CREATE TABLE cycling.addthis (
    a int PRIMARY KEY
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

The double quotes ("…") around the table name ensure that the table is created with the mixed case name.

Unless you need keyspaces or tables with uppercase or mixed case names, DataStax recommends using lowercase only. If your keyspace or table creation commands use uppercase without double quotes, the name is normalized to lowercase.

Table - multi-column (composite) partition key

To define the PRIMARY KEY with a partition key of one or more columns:

CREATE TABLE IF NOT EXISTS cycling.rank_by_year_and_name (
  race_year int,
  race_name text,
  cyclist_name text,
  rank int,
  PRIMARY KEY ((race_year, race_name), rank)
);

Result

CREATE TABLE cycling.rank_by_year_and_name (
    race_year int,
    race_name text,
    rank int,
    cyclist_name text,
    PRIMARY KEY ((race_year, race_name), rank)
) WITH CLUSTERING ORDER BY (rank ASC)
    AND additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

In this example, race_year and race_name are the two columns that comprise the composite partition key. They are encased in parentheses, inside the PRIMARY KEY definition. The column rank is an additional column that is a clustering column, as the next section describes.

Table - clustering columns

The last example showed one example of using a clustering column. Clustering columns in a primary key are all columns listed that are not part of the partition key. The partition key, as shown above, is either the first column defined in the primary key, or a composite partition key encased in parentheses.

Here is another example to make clear the parts of a primary key:

CREATE TABLE IF NOT EXISTS cycling.cyclist_category (
  category text,
  points int,
  id UUID,
  lastname text,
  PRIMARY KEY (category, points)
)
WITH CLUSTERING ORDER BY (points DESC);

Result

CREATE TABLE cycling.cyclist_category (
    category text,
    points int,
    id uuid,
    lastname text,
    PRIMARY KEY (category, points)
) WITH CLUSTERING ORDER BY (points DESC)
    AND additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

In this example, The category column is the partition key, and the points column is a clustering column. It is important to remember that the primary key must uniquely identify a row within the cyclist_category table. More than one row with the same category can exist, as long as that row contains a different points value. All the rows with the same category will sort by the points values in descending order due to the addition of the table option WITH CLUSTER ORDER BY. Normally, columns are sorted in ascending alphabetical order.

Table with a frozen UDT

Create the race_winners table that has a frozen user-defined type (UDT):

CREATE TABLE IF NOT EXISTS cycling.race_winners (
  cyclist_name FROZEN<fullname>,
  race_name text,
  race_position int,
  PRIMARY KEY (race_name, race_position)
);

Result

CREATE TABLE cycling.race_winners (
    race_name text,
    race_position int,
    cyclist_name frozen<fullname>,
    PRIMARY KEY (race_name, race_position)
) WITH CLUSTERING ORDER BY (race_position ASC)
    AND additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

See Creating a user-defined type for information on creating UDTs. UDTs can be created unfrozen if only non-collection fields are used in the user-defined type creation. If the table is created with an unfrozen UDT, then individual field values can be updated and deleted.

Table with a geospatial type

CREATE TABLE cycling.geospatial (
  id text PRIMARY KEY,
  point 'PointType',
  linestring 'LineStringType'
);

Result

CREATE TABLE cycling.geospatial (
    id text PRIMARY KEY,
    linestring 'org.apache.cassandra.db.marshal.LineStringType',
    point 'org.apache.cassandra.db.marshal.PointType'
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

Storing data in descending order

The following example shows a table definition that stores the categories with the highest points first.

CREATE TABLE IF NOT EXISTS cycling.cyclist_category (
  category text,
  points int,
  id UUID,
  lastname text,
  PRIMARY KEY (category, points)
)
WITH CLUSTERING ORDER BY (points DESC);

Table ID for commit log replay to restore table data

Re-create a table with its original ID to facilitate restoring table data by replaying commit logs:

CREATE TABLE IF NOT EXISTS cycling.cyclist_emails (
  userid text PRIMARY KEY,
  id UUID,
  emails set<text>
)
WITH ID = '1bb7516e-b140-11e8-96f8-529269fb1459';

Result

CREATE TABLE cycling.cyclist_emails (
    userid text PRIMARY KEY,
    emails set<text>,
    id uuid
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

To retrieve a table’s ID, query the id column of system_schema.tables. For example:

SELECT id FROM system_schema.tables
WHERE keyspace_name = 'cycling'
  AND table_name = 'cyclist_emails';

Result

 id
--------------------------------------
 1bb7516e-b140-11e8-96f8-529269fb1459

(1 rows)

To perform a point-in-time restoration of a table, see the documentation for Astra DB.

Create a table

Dynamic schema generation and schema collision

Prerequisites

Simple table - single partition key in current keyspace

Simple table - single partition key in another keyspace

Table with primary key separately specified

Table with name of mixed case

Table - multi-column (composite) partition key

Table - clustering columns

Table with a frozen UDT

Table with a geospatial type

Storing data in descending order

Table ID for commit log replay to restore table data

See also

Was this helpful?

Give Feedback