Selectors
The SELECT
statement is used to query data from a table in CQL.
The results are returned as a set of rows.
Selectors are used to specify which columns are queried and returned.
The SELECT
statement can also transform the data on return using functions.
Table data also has related metadata that can be returned using metadata functions.
Return all columns
To return all columns from a table, use the wildcard *
selector.
SELECT * FROM cycling.cyclist_name;
Results
id | firstname | lastname
--------------------------------------+-----------+-----------------
e7ae5cf3-d358-4d99-b900-85902fda9bb0 | Alex | FRAME
fb372533-eb95-4bb4-8685-6ef61e994caa | Michael | MATTHEWS
5b6962dd-3f90-4c93-8f61-eabfa4a803e2 | Marianne | VOS
220844bf-4860-49d6-9a4b-6b5d3a79cbfb | Paolo | TIRALONGO
6ab09bec-e68e-48d9-a5f8-97e6fb4c9b47 | Steven | KRUIKSWIJK
e7cd5752-bc0d-4157-a80f-7523add8dbcd | Anna | VAN DER BREGGEN
(6 rows)
id | firstname | lastname
--------------------------------------+-----------+-----------------
e7ae5cf3-d358-4d99-b900-85902fda9bb0 | Alex | FRAME
fb372533-eb95-4bb4-8685-6ef61e994caa | Michael | MATTHEWS
5b6962dd-3f90-4c93-8f61-eabfa4a803e2 | Marianne | VOS
220844bf-4860-49d6-9a4b-6b5d3a79cbfb | Paolo | TIRALONGO
6ab09bec-e68e-48d9-a5f8-97e6fb4c9b47 | Steven | KRUIKSWIJK
e7cd5752-bc0d-4157-a80f-7523add8dbcd | Anna | VAN DER BREGGEN
(6 rows)
Since the *
selector returns all columns, it is not necessary to specify the column names.
The more columns a table has, the more resources are required to return the data.
Return specific columns
Because the number of columns returned can affect performance, it is recommended to specify the columns to return, if possible.
For example, to return only the lastname
column from the cycling.cyclist_name
table:
SELECT lastname FROM cycling.cyclist_name;
Results
lastname
-----------------
FRAME
MATTHEWS
VOS
TIRALONGO
KRUIKSWIJK
VAN DER BREGGEN
(6 rows)
lastname
-----------------
FRAME
MATTHEWS
VOS
TIRALONGO
KRUIKSWIJK
VAN DER BREGGEN
(6 rows)
To return multiple columns, separate the column names with a comma:
SELECT lastname, firstname FROM cycling.cyclist_name;
Results
lastname | firstname
-----------------+-----------
FRAME | Alex
MATTHEWS | Michael
VOS | Marianne
TIRALONGO | Paolo
KRUIKSWIJK | Steven
VAN DER BREGGEN | Anna
(6 rows)
lastname | firstname
-----------------+-----------
FRAME | Alex
MATTHEWS | Michael
VOS | Marianne
TIRALONGO | Paolo
KRUIKSWIJK | Steven
VAN DER BREGGEN | Anna
(6 rows)
Columns can be renamed on return using an alias:
SELECT id, cyclist_age AS age FROM cycling.cyclist_alt_stats LIMIT 3;
Results
id | age
--------------------------------------+------
e0953617-07eb-4c82-8f91-3b2757981625 | null
1ba0417d-62da-4103-b710-de6fb222db6f | 34
4ceb495c-55ab-4f71-83b9-81117252bf13 | 34
(3 rows)
id | age
--------------------------------------+------
e0953617-07eb-4c82-8f91-3b2757981625 | null
1ba0417d-62da-4103-b710-de6fb222db6f | 34
4ceb495c-55ab-4f71-83b9-81117252bf13 | 34
(3 rows)
A column alias cannot be used in the |
Return transformed data
The SELECT
statement can transform the data on return using functions.
The functions are varied and can be used to manipulate the data in different ways.
For example, JSON
can be used to return all columns as JSON, toJson
to return selected column as JSON, or CAST
can be used to change the data type of a column.
Distinct column values can be returned using the DISTINCT
function.
The number of rows with a particular column can be returned using the COUNT
function.
Any native or user-defined functions can be used to transform the data, such as the SUM
or AVG
function.
Any user-defined aggregate function can be used to transform the data, such as the average
function.
The TOKEN
function can be used to return the token value of a column.
To output selected data from a table in JSON format, use the JSON
function:
SELECT JSON * FROM cycling.birthday_list;
Results
[json]
----------------------------------------------------------------------------------------------------------
{"cyclist_name": "Claudio HEINEN", "blist": {"age": "23", "bday": "27/07/1992", "nation": "GERMANY"}}
{"cyclist_name": "Claudio VANDELLI", "blist": {"age": "54", "bday": "27/07/1961", "nation": "ITALY"}}
{"cyclist_name": "Luc HAGENAARS", "blist": {"age": "28", "bday": "27/07/1987", "nation": "NETHERLANDS"}}
{"cyclist_name": "Toine POELS", "blist": {"age": "52", "bday": "27/07/1963", "nation": "NETHERLANDS"}}
{"cyclist_name": "Allan DAVIS", "blist": {"age": "35", "bday": "27/07/1980", "nation": "AUSTRALIA"}}
{"cyclist_name": "Laurence BOURQUE", "blist": {"age": "23", "bday": "27/07/1992", "nation": "CANADA"}}
(6 rows)
Note the nested JSON structure in the output, with the column name as the key and the column value as the value.
To specify the JSON format for a particular column, enclose its name in the toJson()
function:
SELECT id, commenter, toJson(created_at) FROM cycling.comments_vs;
Results
id | commenter | system.tojson(created_at)
----+-----------+---------------------------
(0 rows)
id | commenter | system.tojson(created_at)
----+-----------+---------------------------
(0 rows)
Note that the JSON-formatted timestamp
column can include complete time zone information.
To cast a column to a different data type, use the CAST
function:
SELECT CAST(created_at AS date) FROM cycling.comments;
Results
cast(created_at as date)
--------------------------
2023-04-01
2024-06-07
(2 rows)
If a table has duplicate values in a column, use the DISTINCT
function to return only the unique values:
SELECT DISTINCT country FROM cycling.country_flag;
Results
country
---------
Belgium
France
(2 rows)
country
---------
Belgium
France
(2 rows)
You may want to count the number of rows in a column.
Use the COUNT
function to return the count:
SELECT start_month, MAX(start_day), COUNT(*) FROM cycling.events
WHERE year = 2017 AND discipline = 'Cyclo-cross';
Results
start_month | system.max(start_day) | count
-------------+-----------------------+-------
null | null | 0
(1 rows)
start_month | system.max(start_day) | count
-------------+-----------------------+-------
1 | 28 | 11
(1 rows)
To use the COUNT
function with a WHERE
clause, include the COUNT
function in the SELECT
list and the WHERE
clause in the query.
For example, to get the number of cyclists from Belgium:
SELECT count(cyclist_name) FROM cycling.country_flag
WHERE country='Belgium';
Results
system.count(cyclist_name)
----------------------------
2
(1 rows)
system.count(cyclist_name)
----------------------------
2
(1 rows)
A number of native functions can be used to transform the data.
The data is transformed only on return, rather than stored in the table.
These are general functions that are familiar to most people, such as the average, or AVG
function:
SELECT AVG(race_points) FROM cycling.cyclist_points
WHERE id = e3b19ec4-774a-4d1c-9e5a-decec1e30aac;
Results
system.avg(race_points)
-------------------------
67
(1 rows)
system.avg(race_points)
-------------------------
67
(1 rows)
User-defined functions can also be used to transform the data. For examples, the 'left` function can be used to return a column that is left-justified:
SELECT left(firstname, 1), lastname FROM cycling.cyclist_name;
Results
User-defined aggregates can also be used to transform the data.
For examples, the 'average` function can be used to retrieve the average of the column cyclist_time_sec
from a table:
SELECT average(cyclist_time_sec) AS Average FROM cycling.team_average
WHERE team_name = 'UnitedHealthCare Pro Cycling Womens Team';
Results
The TOKEN
function can be used to return the token value of a column.
The token value is the hashed value of the partition key, thus it can only be used on the partition key column or columns.
This function is useful when using the token value in a query to filter data with a WHERE
clause.
To return the token value of the race_id
column in the cycling.race
table:
SELECT race_name AS name,race_year AS year FROM cycling.rank_by_year_and_name
WHERE TOKEN(race_year,race_name) >= 4582455970709790046;
Results
name | year
--------------------------------------------+------
Tour of Japan - Stage 4 - Minami > Shinshu | 2015
Tour of Japan - Stage 4 - Minami > Shinshu | 2015
Tour of Japan - Stage 4 - Minami > Shinshu | 2015
(3 rows)
name | year
--------------------------------------------+------
Tour of Japan - Stage 4 - Minami > Shinshu | 2015
Tour of Japan - Stage 4 - Minami > Shinshu | 2015
Tour of Japan - Stage 4 - Minami > Shinshu | 2015
(3 rows)
Return column metadata
Each non-primary key column in a table has metadata associated with it.
The WRITETIME
function returns the timestamp of the last write to a column.
The TTL
function returns the time to live of a column.
It is important to note that the metadata functions are not available for all columns.
For example, the metadata functions are not available for columns with a data type of counter
or primary key columns.
Additionally, the metadata functions are not available for columns that are part of a collection or a user-defined type that is not frozen.
To return the writetime of the firstname
column in the cycling.cyclist_points
table:
SELECT WRITETIME (firstname) FROM cycling.cyclist_points
WHERE id = e3b19ec4-774a-4d1c-9e5a-decec1e30aac;
Results
writetime(firstname)
----------------------
1718927710224973
1718927710227830
1718927710231004
(3 rows)
writetime(firstname)
----------------------
1719804935592277
1719804935627989
1719804935630998
(3 rows)
The TTL
function returns the time to live of a column.
This function is useful when using Time to Live (TTL) to expire data in a table.
If a TTL
is set on a column, the data is automatically deleted after the specified time has elapsed.
To return the TTL of the race_points
column in the cycling.cyclist_points
table:
SELECT TTL(race_name) FROM cycling.calendar WHERE race_id = 200;
Results
ttl(race_name)
----------------
199
(1 rows)
ttl(race_name)
----------------
200
(1 rows)