Create a collection

To create a collection, you must define the collection type, column name, and column data type. The cqlsh command-line tool is the primary tool for interacting with CQL to define schema.

Make sure you understand the differences in the collection types before creating a collection, as well as the difference between frozen and non-frozen collections.

Collections are not paged internally. Reading a range of values is not supported for collections.

Prerequisites

  • Keyspace must exist.

  • Understand the use and importance of the primary key in table schema.

  • Determine the type of collection you want to create.

Set collections

A set consists of a unordered group of elements with unique values. Duplicate values will not be stored distinctly. The values of a set are stored unordered, but will return the elements in sorted order when queried. Use the set data type to store data that has a many-to-one relationship with another column.

Non-frozen set

To create a non-frozen set collection in a table, specify the collection column name and column data type enclosed in angle brackets:

CREATE TABLE IF NOT EXISTS cycling.cyclist_career_teams (
  id UUID PRIMARY KEY,
  lastname text,
  teams set<text>
);
Results

The set collection is created with the column name teams, and the data type text.

Frozen set

To create a frozen set collection in a table specify the collection column name and a column data type enclosed with angle brackets, with the FROZEN keyword:

CREATE TABLE IF NOT EXISTS cycling.race_results (
  race_name text PRIMARY KEY, 
  race_history frozen<set<int>>,
  top_three frozen<map<int, text>>
);
Results

The set collection is created with the column name race_history, and the data type int.

CREATE TABLE cycling.race_results (
    race_name text PRIMARY KEY,
    race_history frozen<set<int>>,
    top_three frozen<map<int, text>>
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

List collections

A list is similar to a set; it groups and stores values. Unlike a set, the values stored in a list do not need to be unique and can be duplicated. In addition, a list stores the elements in a particular order and may be inserted or retrieved according to an index value.

Use the list data type to store data that has a possible many-to-many relationship with another column.

Non-frozen list

To create a non-frozen list collection in a table, specify a collection column name, and a column data type enclosed in angle brackets:

CREATE TABLE IF NOT EXISTS cycling.upcoming_calendar (
  year int,
  month int,
  events list<text>,
  PRIMARY KEY (year, month)
);
Results

The list collection is created with the column name events, and the data type text.

CREATE TABLE cycling.upcoming_calendar (
    year int,
    month int,
    events list<text>,
    PRIMARY KEY (year, month)
) WITH CLUSTERING ORDER BY (month ASC)
    AND additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

Frozen list

To create a frozen list collection in a table, specify a collection column name and a column data type enclosed in angle brackets, with the FROZEN keyword:

CREATE TABLE IF NOT EXISTS cycling.race_starts (
  cyclist_name text PRIMARY KEY,
  rnumbers FROZEN<LIST<int>>
);
Results

The list collection is created with the column name rnumbers, and the data type int.

CREATE TABLE cycling.race_starts (
    cyclist_name text PRIMARY KEY,
    rnumbers frozen<list<int>>
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

Map collections

A map relates one item to another with a key-value pair. For each key, only one value may exist, and duplicates cannot be stored. Both the key and the value are designated with a data type.

For example, using the map type, you can store timestamp-related information in user profiles. Each element of a non-frozen map is internally stored as a single column that you can modify, replace, delete, and query. Each element can have an individual time-to-live and expire when the TTL ends.

Non-frozen map

To create a non-frozen map collection in a table, specify a collection column name, and a pair of column data types enclosed in angle brackets:

CREATE TABLE IF NOT EXISTS cycling.cyclist_teams (
  id uuid PRIMARY KEY,
  firstname text,
  lastname text,
  teams map<int, text>
);
Results

The map collection is created with the column name teams, and the data types will have a key year of integer type, and a team name value of text type.

CREATE TABLE cycling.cyclist_teams (
    id uuid PRIMARY KEY,
    firstname text,
    lastname text,
    teams map<int, text>
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

Frozen map

To create a frozen map collection in a table, specify a collection column name, and a pair of column data types enclosed in angle brackets, with the FROZEN keyword:

CREATE TABLE IF NOT EXISTS cycling.race_results (
  race_name text PRIMARY KEY, 
  race_history frozen<set<int>>,
  top_three frozen<map<int, text>>
);
Results

The map collection is created with the column name top_three, and the data types will have a key rank of integer type, and a cyclist name value of text type.

CREATE TABLE cycling.race_results (
    race_name text PRIMARY KEY,
    race_history frozen<set<int>>,
    top_three frozen<map<int, text>>
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

Related information:

Was this helpful?

Give Feedback

How can we improve the documentation?

© 2024 DataStax | Privacy policy | Terms of use

Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Pulsar, Pulsar, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries. Kubernetes is the registered trademark of the Linux Foundation.

General Inquiries: +1 (650) 389-6000, info@datastax.com