Selectors

The SELECT statement is used to query data from a table in CQL. The results are returned as a set of rows. Selectors are used to specify which columns are queried and returned. The SELECT statement can also transform the data on return using functions. Table data also has related metadata that can be returned using metadata functions.

Return all columns

To return all columns from a table, use the wildcard * selector.

SELECT * FROM cycling.cyclist_name;
Results
 id                                   | firstname | lastname
--------------------------------------+-----------+-----------------
 e7ae5cf3-d358-4d99-b900-85902fda9bb0 |      Alex |           FRAME
 fb372533-eb95-4bb4-8685-6ef61e994caa |   Michael |        MATTHEWS
 5b6962dd-3f90-4c93-8f61-eabfa4a803e2 |  Marianne |             VOS
 220844bf-4860-49d6-9a4b-6b5d3a79cbfb |     Paolo |       TIRALONGO
 6ab09bec-e68e-48d9-a5f8-97e6fb4c9b47 |    Steven |      KRUIKSWIJK
 e7cd5752-bc0d-4157-a80f-7523add8dbcd |      Anna | VAN DER BREGGEN

(6 rows)

Since the * selector returns all columns, it is not necessary to specify the column names. The more columns a table has, the more resources are required to return the data.

Return specific columns

Because the number of columns returned can affect performance, it is recommended to specify the columns to return, if possible.

For example, to return only the lastname column from the cycling.cyclist_name table:

SELECT lastname FROM cycling.cyclist_name;
Results
 lastname
-----------------
           FRAME
        MATTHEWS
             VOS
       TIRALONGO
      KRUIKSWIJK
 VAN DER BREGGEN

(6 rows)

To return multiple columns, separate the column names with a comma:

SELECT lastname, firstname FROM cycling.cyclist_name;
Results
 lastname        | firstname
-----------------+-----------
           FRAME |      Alex
        MATTHEWS |   Michael
             VOS |  Marianne
       TIRALONGO |     Paolo
      KRUIKSWIJK |    Steven
 VAN DER BREGGEN |      Anna

(6 rows)

Columns can be renamed on return using an alias:

SELECT id, cyclist_age AS age FROM cycling.cyclist_alt_stats LIMIT 3;
Results
 id                                   | age
--------------------------------------+------
 e0953617-07eb-4c82-8f91-3b2757981625 | null
 1ba0417d-62da-4103-b710-de6fb222db6f |   34
 4ceb495c-55ab-4f71-83b9-81117252bf13 |   34

(3 rows)

A column alias cannot be used in the WHERE clause; use the original column name.

Return transformed data

The SELECT statement can transform the data on return using functions. The functions are varied and can be used to manipulate the data in different ways. For example, JSON can be used to return all columns as JSON, toJson to return selected column as JSON, or CAST can be used to change the data type of a column. Distinct column values can be returned using the DISTINCT function. The number of rows with a particular column can be returned using the COUNT function. Any native or user-defined functions can be used to transform the data, such as the SUM or AVG function. Any user-defined aggregate function can be used to transform the data, such as the average function. The TOKEN function can be used to return the token value of a column.

To output selected data from a table in JSON format, use the JSON function:

SELECT JSON * FROM cycling.birthday_list;
Results
 [json]
----------------------------------------------------------------------------------------------------------
    {"cyclist_name": "Claudio HEINEN", "blist": {"age": "23", "bday": "27/07/1992", "nation": "GERMANY"}}
    {"cyclist_name": "Claudio VANDELLI", "blist": {"age": "54", "bday": "27/07/1961", "nation": "ITALY"}}
 {"cyclist_name": "Luc HAGENAARS", "blist": {"age": "28", "bday": "27/07/1987", "nation": "NETHERLANDS"}}
   {"cyclist_name": "Toine POELS", "blist": {"age": "52", "bday": "27/07/1963", "nation": "NETHERLANDS"}}
     {"cyclist_name": "Allan DAVIS", "blist": {"age": "35", "bday": "27/07/1980", "nation": "AUSTRALIA"}}
   {"cyclist_name": "Laurence BOURQUE", "blist": {"age": "23", "bday": "27/07/1992", "nation": "CANADA"}}

(6 rows)

Note the nested JSON structure in the output, with the column name as the key and the column value as the value.

To specify the JSON format for a particular column, enclose its name in the toJson() function:

SELECT id, commenter, toJson(created_at) FROM cycling.comments_vs;
Results
 id                                   | commenter | system.tojson(created_at)
--------------------------------------+-----------+----------------------------
 e8ae5cf3-d358-4d99-b900-85902fda9bb0 |      John | "2017-04-01 14:33:02.160Z"
 e7ae5cf3-d358-4d99-b900-85902fda9bb0 |      Alex | "2017-04-01 14:33:02.160Z"
 e7ae5cf3-d358-4d99-b900-85902fda9bb0 |      Alex | "2017-03-21 21:11:09.999Z"
 e7ae5cf3-d358-4d99-b900-85902fda9bb0 |      Alex | "2017-02-14 20:43:20.000Z"
 c7fceba0-c141-4207-9494-a29f9809de6f |       Amy | "2024-07-02 22:39:30.758Z"
 c7fceba0-c141-4207-9494-a29f9809de6f |       Amy | "2017-04-01 13:43:08.030Z"
 c7fceba0-c141-4207-9494-a29f9809de6f |       Amy | "2017-03-22 01:16:59.001Z"
 c7fceba0-c141-4207-9494-a29f9809de6f |       Amy | "2017-02-17 08:43:20.234Z"
 c7fceba0-c141-3207-9494-a29f9809de6f |       Amy | "2017-02-17 08:43:20.234Z"
 c7fceba0-c141-7207-9494-a29f9809de6f |       Amy | "2024-07-02 22:39:30.763Z"
 e8ae5df3-d358-4d99-b900-85902fda9bb0 |      Jane | "2017-04-01 14:33:02.160Z"
 c7fceba0-c141-4207-9594-a29f9809de6f |      Jane | "2017-02-17 08:43:20.234Z"

(12 rows)

Note that the JSON-formatted timestamp column can include complete time zone information.

To cast a column to a different data type, use the CAST function:

  SELECT CAST(created_at AS date) FROM cycling.comments;
Results
 cast(created_at as date)
--------------------------
               2023-04-01
               2024-06-07

(2 rows)

If a table has duplicate values in a column, use the DISTINCT function to return only the unique values:

SELECT DISTINCT country FROM cycling.country_flag;
Results
 country
---------
 Belgium
  France

(2 rows)

You may want to count the number of rows in a column. Use the COUNT function to return the count:

SELECT start_month, MAX(start_day), COUNT(*) FROM cycling.events
  WHERE year = 2017 AND discipline = 'Cyclo-cross';
Results
 start_month | system.max(start_day) | count
-------------+-----------------------+-------
           1 |                    28 |    11

(1 rows)

To use the COUNT function with a WHERE clause, include the COUNT function in the SELECT list and the WHERE clause in the query. For example, to get the number of cyclists from Belgium:

SELECT count(cyclist_name) FROM cycling.country_flag 
 WHERE country='Belgium';
Results
 system.count(cyclist_name)
----------------------------
                          2

(1 rows)

A number of native functions can be used to transform the data. The data is transformed only on return, rather than stored in the table. These are general functions that are familiar to most people, such as the average, or AVG function:

SELECT AVG(race_points) FROM cycling.cyclist_points 
  WHERE id = e3b19ec4-774a-4d1c-9e5a-decec1e30aac;
Results
 system.avg(race_points)
-------------------------
                      67

(1 rows)

User-defined functions can also be used to transform the data. For examples, the 'left` function can be used to return a column that is left-justified:

SELECT left(firstname, 1), lastname FROM cycling.cyclist_name;
Results

User-defined aggregates can also be used to transform the data. For examples, the 'average` function can be used to retrieve the average of the column cyclist_time_sec from a table:

SELECT average(cyclist_time_sec) AS Average FROM cycling.team_average
  WHERE team_name = 'UnitedHealthCare Pro Cycling Womens Team';
Results
 average
-------------
 11474.66667

(1 rows)

Warnings :
User defined function cycling.average_state : (tuple<int, bigint>, int) -> tuple<int, bigint> consumed more than 500µs CPU time (2120µs)

The TOKEN function can be used to return the token value of a column. The token value is the hashed value of the partition key, thus it can only be used on the partition key column or columns. This function is useful when using the token value in a query to filter data with a WHERE clause.

To return the token value of the race_id column in the cycling.race table:

SELECT race_name AS name,race_year AS year FROM cycling.rank_by_year_and_name
  WHERE TOKEN(race_year,race_name) >= 4582455970709790046;
Results
 name                                       | year
--------------------------------------------+------
 Tour of Japan - Stage 4 - Minami > Shinshu | 2015
 Tour of Japan - Stage 4 - Minami > Shinshu | 2015
 Tour of Japan - Stage 4 - Minami > Shinshu | 2015

(3 rows)

Return column metadata

Each non-primary key column in a table has metadata associated with it. The WRITETIME function returns the timestamp of the last write to a column. The TTL function returns the time to live of a column.

It is important to note that the metadata functions are not available for all columns. For example, the metadata functions are not available for columns with a data type of counter or primary key columns. Additionally, the metadata functions are not available for columns that are part of a collection or a user-defined type that is not frozen.

To return the writetime of the firstname column in the cycling.cyclist_points table:

SELECT WRITETIME (firstname) FROM cycling.cyclist_points
  WHERE id = e3b19ec4-774a-4d1c-9e5a-decec1e30aac;
Results
 writetime(firstname)
----------------------
     1719960338817118
     1719960338823276
     1719960338827071

(3 rows)

The TTL function returns the time to live of a column. This function is useful when using Time to Live (TTL) to expire data in a table. If a TTL is set on a column, the data is automatically deleted after the specified time has elapsed.

To return the TTL of the race_points column in the cycling.cyclist_points table:

SELECT TTL(race_name) FROM cycling.calendar WHERE race_id = 200;
Results
 ttl(race_name)
----------------
            199

(1 rows)

Was this helpful?

Give Feedback

How can we improve the documentation?

© 2024 DataStax | Privacy policy | Terms of use

Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Pulsar, Pulsar, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries. Kubernetes is the registered trademark of the Linux Foundation.

General Inquiries: +1 (650) 389-6000, info@datastax.com