Data types
A data type is declared and enforced for each column in a table.
CQL data types
String types
Wrap strings in single quotes or double dollar signs ($$) in INSERT, UPDATE, and the WHERE clause of SELECT statements.
When using single quotes, additional escaping is required for field values that contain single quotes or reserved characters; see Escaping characters.
For example, to insert a comment into a text field that contains a single quote and emoji:
INSERT INTO cycling.comments (
id, created_at, comment
) VALUES (
e7ae5cf3-d358-4d99-b900-85902fda9bb0,
currentTimestamp(),
$$ It's pouring rain, race should have been postponed :'( $$
);
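For comparison, the same comment can be written with single quotes. This is a minimal sketch that assumes the standard CQL rule of escaping a single quote by doubling it; it reuses the table and values from the example above.

```cql
-- Same row as the $$ example, written with single quotes.
-- Each apostrophe inside the string (in "It's" and ":'(") is doubled.
INSERT INTO cycling.comments (
  id, created_at, comment
) VALUES (
  e7ae5cf3-d358-4d99-b900-85902fda9bb0,
  currentTimestamp(),
  ' It''s pouring rain, race should have been postponed :''( '
);
```

The dollar-quoted form avoids the doubling, which is why it tends to be easier to read for free-text fields.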
- ascii
-
US-ASCII characters.
- text
-
UTF-8 encoded string.
- varchar
-
UTF-8 encoded string.
Numeric types
Enter numeric values without quotes. For example, age = 31.
Integers
- tinyint
-
8-bit signed integer.
- smallint
-
16-bit signed integer.
- int
-
32-bit signed integer.
- bigint
-
64-bit signed integer.
- varint
-
Arbitrary-precision integer.
Decimals
The default decimal separator is a period (.). Change the decimal separator in the driver settings or with the cqlshrc decimalsep option. Internally, the decimal separator is stored as a period.
- decimal
-
Variable-precision decimal. Supports integers and floats.
When handling currency, a best practice is to use a currency class that serializes to and from an int, or to use the decimal form.
- float
-
32-bit IEEE-754 floating point.
- double
-
64-bit IEEE-754 floating point.
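The numeric types above might be combined as in the following sketch. The table cycling.race_entries and its columns are hypothetical and used only for illustration.

```cql
-- Hypothetical table: the table and column names are illustrative only.
CREATE TABLE IF NOT EXISTS cycling.race_entries (
  id uuid PRIMARY KEY,
  age int,             -- 32-bit signed integer
  entry_fee decimal,   -- variable-precision decimal, suited to currency amounts
  avg_speed double     -- 64-bit IEEE-754 floating point
);

-- Numeric literals are written without quotes.
INSERT INTO cycling.race_entries (id, age, entry_fee, avg_speed)
VALUES (e7ae5cf3-d358-4d99-b900-85902fda9bb0, 31, 49.95, 38.6);
```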
Date and time types
INSERT or UPDATE date and time values as a string in single quotes or as an unquoted integer.
For example, set a date in string format with purchase_date = '2017-05-12', or as an integer in days since epoch with purchase_date = 17298.
- date
-
32-bit unsigned integer representing the number of days since epoch (January 1, 1970) with no corresponding time value.
INSERT or UPDATE values as an integer (days since epoch) or in string format '<yyyy>-<mm>-<dd>', for example '2017-05-13'.
When loading data from CSV, use the datetimeformat option in a cqlshrc file to change the cqlsh COPY TO date parsing format.
- DateRangeType
-
Stores a date range. A truncated timestamp represents the entire date span; use the range syntax to create a custom range:
[<beginning_date> TO <end_date>]
Insert the custom ranges between square brackets. For example:
  - 2018-01: Beginning of the first day to the end of the last day in January 2018.
  - 2018-01T15: Range includes hours of the day, from 1500 to before 1600 (3pm to 4pm).
  - [2017-01-15 TO 2017-11-01]: The start of the fifteenth of January through the end of the first day of November.
  - [2017 TO 2017-11-01]: Start of 2017 until the end of the first day of November.
  - [* TO 2018-01-31]: From the earliest representable time through the end of the day on 2018-01-31.
If you specify a date instance using a date function, like currentDate(), you get the first millisecond of that day, not the entire day's range.
The data type name is case sensitive. Use single quotes to specify this type in a CQL statement.
- duration
-
Encoded as three signed integers of variable length, representing the number of:
  - Months
  - Days
  - Nanoseconds
The numbers of months and days are decoded as 32-bit integers. The number of nanoseconds is decoded as a 64-bit integer. Provide the duration value using one of these formats:
  - Duration format: <N>y<N>mo<N>w<N>d<N>h<N>m<N>s<N>ms<N>us<N>ns. For example, 12h30m. The units are:
    - y: years (12 months)
    - mo: months (1 month)
    - w: weeks (7 days)
    - d: days (1 day)
    - h: hours (3,600,000,000,000 nanoseconds)
    - m: minutes (60,000,000,000 nanoseconds)
    - s: seconds (1,000,000,000 nanoseconds)
    - ms: milliseconds (1,000,000 nanoseconds)
    - us or µs: microseconds (1,000 nanoseconds)
    - ns: nanoseconds (1 nanosecond)
  - ISO 8601 format: P[n]Y[n]M[n]DT[n]H[n]M[n]S or P[n]W
  - ISO 8601 alternative format: P[YYYY]-[MM]-[DD]T[hh]:[mm]:[ss]
Restriction: The PRIMARY KEY does not support the duration type because it is not possible to determine whether 1mo is greater than 29d without a date context.
- time
-
64-bit signed integer representing the number of nanoseconds since midnight, with no corresponding date value.
- timestamp
-
64-bit signed integer representing the date and time since epoch (January 1, 1970 at 00:00:00 GMT) in milliseconds.
INSERT or UPDATE string format is ISO 8601; the string must contain the date and can optionally include the time and time zone:
'<yyyy>-<mm>-<dd> [<hh>:<MM>:<ss>[.<fff>]][+/-<NNNN>]'
where <NNNN> is the RFC 822 4-digit time zone specification (+0000 refers to GMT and US PST is -0800). If no time zone is specified, the client time zone is assumed. For example, '2015-05-03 13:30:54.234-0800', '2015-05-03 13:30:54+0400', or '2015-05-03'.
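The string formats described for date, duration, time, and timestamp can be sketched together as follows. The table cycling.race_schedule and its columns are hypothetical, and the 'hh:mm:ss.fff' string form used for the time column is an assumption that is not stated in this section.

```cql
-- Hypothetical table: names are illustrative only.
CREATE TABLE IF NOT EXISTS cycling.race_schedule (
  id uuid PRIMARY KEY,
  race_date date,
  start_time time,
  registered_at timestamp,
  time_limit duration      -- allowed as a regular column, not in the PRIMARY KEY
);

INSERT INTO cycling.race_schedule (id, race_date, start_time, registered_at, time_limit)
VALUES (
  e7ae5cf3-d358-4d99-b900-85902fda9bb0,
  '2017-05-13',                     -- date: '<yyyy>-<mm>-<dd>'
  '13:30:54.234',                   -- time: assumed 'hh:mm:ss.fff' string form
  '2015-05-03 13:30:54.234-0800',   -- timestamp with an RFC 822 time zone
  12h30m                            -- duration format, written without quotes
);

-- The same duration written in ISO 8601 format.
UPDATE cycling.race_schedule
SET time_limit = PT12H30M
WHERE id = e7ae5cf3-d358-4d99-b900-85902fda9bb0;
```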
Unique identifiers
- uuid
-
128-bit universally unique identifier (UUID). Generate with the uuid function.
Specialized types
- blob
-
Arbitrary bytes (no validation), expressed as hexadecimal. See Blob conversion functions.
- counter
-
64-bit signed integer. Only one counter column is allowed per table. All other columns in a counter table must be part of the PRIMARY KEY. Increment and decrement the counter with an UPDATE statement using the + and - operators. Null values are not supported in the counter column; the initial count equals 0.
The data type name is case sensitive. Use single quotes to specify this type in a CQL statement.
- inet
-
IP address string in IPv4 or IPv6 format.
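As an illustration of the counter type described above, the following sketch creates a counter table and changes the count with UPDATE. The table cycling.popular_count and the id value are hypothetical.

```cql
-- Hypothetical counter table: names are illustrative only.
CREATE TABLE IF NOT EXISTS cycling.popular_count (
  id uuid PRIMARY KEY,   -- the only non-counter column is the primary key
  popularity counter
);

-- Counter rows are not inserted; the first UPDATE starts from the initial count of 0.
UPDATE cycling.popular_count
SET popularity = popularity + 1
WHERE id = 6ab09bec-e68e-48d9-a5f8-97e6fb4c9b47;

-- Decrement with the - operator.
UPDATE cycling.popular_count
SET popularity = popularity - 1
WHERE id = 6ab09bec-e68e-48d9-a5f8-97e6fb4c9b47;
```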
Geospatial types
- PointType
-
Contains two coordinate values for latitude and longitude. See Creating a table with a geospatial type for details on entering point information.
The data type name is case sensitive. Use single quotes to specify this type in a CQL statement.
- LineStringType
-
Comma-separated list of points. See Creating a table with a geospatial type for details on entering linestring information.
The data type name is case sensitive. Use single quotes to specify this type in a CQL statement.
- PolygonType
-
Set of two linestrings.
The data type name is case sensitive. Use single quotes to specify this type in a CQL statement.
Collection types
CQL supports storing multiple values in a single column. Use collections to store or denormalize small amounts of data, such as phone numbers, tags, or addresses. Collections are not appropriate for data that is expected to grow unbounded, such as all events for a particular user; instead use a table with clustering columns.
Non-frozen collections have the following characteristics and limitations:
-
Because collections are not indexed or paged internally, the entire collection is read in order to access a single element.
-
Some operations on lists incur a read-before-write. List operations are also not idempotent by nature and can cause timeout issues when retried. INSERT on sets and maps never incurs a read-before-write internally; therefore, DataStax recommends sets over lists whenever possible.
Restriction: Storing a large amount of data in a single collection is an anti-pattern and therefore not supported.
- frozen
-
Use frozen on a set, map, or list to serialize multiple components into a single value, frozen<<collection_definition>>. Non-frozen types allow updates to individual fields, but values in a frozen collection are treated like blobs: any upsert overwrites the entire value.
- list
-
Comma-separated list of non-unique values of the same data type, list<<data_type>>. Elements are ordered by their position in the list; the first position is zero. Supports appending and prepending elements in INSERT and UPDATE statements using the + and - operators.
Lists have limitations and a performance impact; whenever possible, use a set or a frozen list, for example frozen<list<int>>. The append and prepend operations are not idempotent. If either operation times out, the retry may (or may not) result in appending or prepending the value twice.
- map
-
Set of key-value pairs, where keys are unique and the map is sorted by its keys, map<<key_data_type>, <value_data_type>>.
- tuple
-
Fixed-length set of elements of different types. Unlike other collection types, a tuple is always frozen (without the need for the frozen keyword). The entire field is overwritten when using INSERT and UPDATE; therefore, the expressions must provide a value for each element, and elements that have no value must be explicitly declared null. Tuples can contain tuples, for example tuple<int,tuple<text,text>,boolean>, and can also be specified as the data type of another collection type, for example set<tuple<text,inet>>.
- user defined type (UDT)
-
Customized collection type that belongs to a specific keyspace. The UDT is only available in the keyspace where it is created. The system_schema.types table lists all UDTs, with the keyspace_name, type_name, field_names, and field_types of each.
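The collection types described above might be used together as in the following sketch. The type cycling.fullname, the table cycling.cyclist_profile, and all values are hypothetical and chosen only for illustration.

```cql
-- Hypothetical type and table: names are illustrative only.
CREATE TYPE IF NOT EXISTS cycling.fullname (
  firstname text,
  lastname text
);

CREATE TABLE IF NOT EXISTS cycling.cyclist_profile (
  id uuid PRIMARY KEY,
  name frozen<fullname>,                  -- UDT stored as a single frozen value
  events list<text>,                      -- ordered, duplicates allowed
  teams map<int, text>,                   -- unique keys, sorted by key
  last_result tuple<int, text, boolean>,  -- always frozen
  past_results frozen<list<int>>          -- frozen list, overwritten as a whole
);

INSERT INTO cycling.cyclist_profile (id, name, events, teams, last_result, past_results)
VALUES (
  e7ae5cf3-d358-4d99-b900-85902fda9bb0,
  {firstname: 'Alex', lastname: 'Smith'},
  ['Spring Classic'],
  {2015: 'Team A'},
  (3, null, true),   -- every tuple element is given; the missing one is null
  [1, 4, 2]
);

-- Append to and prepend to a list (neither is idempotent if retried).
UPDATE cycling.cyclist_profile SET events = events + ['Autumn Classic']
WHERE id = e7ae5cf3-d358-4d99-b900-85902fda9bb0;
UPDATE cycling.cyclist_profile SET events = ['Winter Classic'] + events
WHERE id = e7ae5cf3-d358-4d99-b900-85902fda9bb0;

-- Add or replace a single entry in a non-frozen map.
UPDATE cycling.cyclist_profile SET teams[2016] = 'Team B'
WHERE id = e7ae5cf3-d358-4d99-b900-85902fda9bb0;

-- A frozen value is replaced as a whole.
UPDATE cycling.cyclist_profile SET past_results = [1, 4, 2, 5]
WHERE id = e7ae5cf3-d358-4d99-b900-85902fda9bb0;

-- List the UDTs defined in the keyspace.
SELECT type_name, field_names, field_types
FROM system_schema.types
WHERE keyspace_name = 'cycling';
```

The frozen list is included to show the trade-off: it gives up element-level updates in exchange for being written and read as one value.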
Vector type
CQL supports storing multiple values in a single column as a vector, an array whose element data type is restricted. Use vectors to store AI embeddings, with float32 values.
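A minimal sketch of a vector column follows. It assumes a database version with vector support and the vector<float, n> declaration syntax; the table cycling.comment_embeddings, its columns, and the dimension of 3 are hypothetical.

```cql
-- Hypothetical table: names and the dimension of 3 are illustrative only.
CREATE TABLE IF NOT EXISTS cycling.comment_embeddings (
  id uuid PRIMARY KEY,
  comment text,
  comment_vector vector<float, 3>   -- fixed-length array of 32-bit floats
);

-- The vector literal must contain exactly as many values as the declared dimension.
INSERT INTO cycling.comment_embeddings (id, comment, comment_vector)
VALUES (
  e7ae5cf3-d358-4d99-b900-85902fda9bb0,
  'Raining too hard, should have postponed',
  [0.45, 0.09, 0.01]
);
```

Real embeddings typically have hundreds or thousands of dimensions; the small dimension here only keeps the example readable.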
Deprecated types
The following types are supported for backward compatibility only.
- custom type
-
Deprecated; supported for backward compatibility only. Customized type added as a subclass of AbstractType, where the class name is fully qualified or relative to the org.apache.cassandra.db.marshal package. Replaced by user defined type (UDT).