CQL native aggregates

Aggregate functions work on a set of rows matching a SELECT statement to return a single value. Null values in the data set are ignored. When other columns, columns with user-defined types, or functions, are also specified in the selector clause of a SELECT statement with an aggregate function, the values in the first row matching the query are returned.

Aggregating data across partitions may cause performance issues. The recommendation is to use aggregates on one partition, to get a list of partition keys use SELECT DISTINCT partition_key to get a list of unique keys (for compound keys use a comma separated list of column names).

Synopsis

<aggregate_name>(<column_name>)

Where the system aggregate names are listed below:

AVG

AVG(<column_name>): Provides the average value of the target column using the data type of target column. Null values are ignored. Only works on numeric columns, that is tinyint, smallint, int, bigint, decimal, float, and double.

COUNT

COUNT(<column_name>): Provides the number of rows in the result set that do not have null values. For a complete row count use partition key column, since partition keys cannot contain nulls.

Works on all CQL data types columns.

MIN

MIN(<column_name>): Provides the smallest value. Null values are ignored.

Only works on numeric columns, that is tinyint, smallint, int, bigint, decimal, float, and double.

MAX

MAX(<column_name>): Provides the largest value. Null values are ignored.

Only works on numeric columns, that is tinyint, smallint, int, bigint, decimal, float, and double.

SUM

SUM(<column_name>): Provides the total of the target column; nulls are ignored.

Only works on numeric columns, that is tinyint, smallint, int, bigint, decimal, float, and double.

Examples

The following examples show how to use aggregates using the cyclist examples.

Find Average (AVG)

Get average time in seconds for the team:

SELECT AVG(cyclist_time_sec) AS Average
FROM cycling.team_average
WHERE team_name = 'UnitedHealthCare Pro Cycling Womens Team';

Results:

average
---------
  11474

(1 rows)

Find Count (COUNT)

Find the number rows for the United Health Care Pro Cycling Women’s Team:

SELECT COUNT(cyclist_name) AS Row_Count
FROM cycling.team_average
WHERE team_name = 'UnitedHealthCare Pro Cycling Womens Team';

Results:

row_count
-----------
        3

(1 rows)

Find lowest value (MIN)

Find the slowest time recorded for the United Health Care Pro Cycling Women’s Team:

SELECT MIN(cyclist_time_sec) AS Fastest
FROM cycling.team_average
WHERE team_name = 'UnitedHealthCare Pro Cycling Womens Team';

Results:

fastest
---------
  11449

(1 rows)

Find highest value (MAX)

Find the fastest time recorded for the United Health Care Pro Cycling Women’s Team:

SELECT MAX(cyclist_time_sec) AS Slowest
FROM cycling.team_average
WHERE team_name = 'UnitedHealthCare Pro Cycling Womens Team';

Results:

slowest
---------
  11490

(1 rows)

Find total (SUM)

Find the total of all times recorded for the United Health Care Pro Cycling Women’s Team:

SELECT SUM(cyclist_time_sec) AS Total_Time
FROM cycling.team_average
WHERE team_name = 'UnitedHealthCare Pro Cycling Womens Team';

Results:

total_time
------------
     34424

(1 rows)

Was this helpful?

Give Feedback

How can we improve the documentation?

© 2024 DataStax | Privacy policy | Terms of use

Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Pulsar, Pulsar, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries. Kubernetes is the registered trademark of the Linux Foundation.

General Inquiries: +1 (650) 389-6000, info@datastax.com