CREATE AGGREGATE
Defines a user-defined aggregate.
An aggregate executes a user-define function (UDF) on each row in a selected data set, optionally runs a final UDF on the result set and returns a single value, for example average or standard deviation.
See also: DROP AGGREGATE
Syntax
CREATE [ OR REPLACE ] AGGREGATE [ IF NOT EXISTS ] [<keyspace_name>.]<aggregate_name> (<cql_type>) SFUNC <udf_name> STYPE <cql_type> FINALFUNC <udf_name> INITCOND <init_value> [ DETERMINISTIC ] ;
Syntax legend
Syntax conventions | Description |
---|---|
UPPERCASE |
Literal keyword. |
Lowercase |
Not literal. |
|
Variable value. Replace with a user-defined value. |
|
Optional.
Square brackets ( |
|
Group.
Parentheses ( |
|
Or.
A vertical bar ( |
|
Repeatable.
An ellipsis ( |
|
Single quotation ( |
|
Map collection.
Braces ( |
Set, list, map, or tuple.
Angle brackets ( |
|
|
End CQL statement.
A semicolon ( |
|
Separate the command line options from the command arguments with two hyphens ( |
|
Search CQL only: Single quotation marks ( |
|
Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrConfig files. |
Required parameters
- cql_type
-
Specify the CQL type input.
Restriction: Frozen collections are not supported.
- SFUNC <udf_name>
-
Specify a user-defined function. Calls the state function (SFUNC) for each row. The first parameter declared in the user-defined function is the state parameter; the function’s return value is assigned to the state parameter, which is passed to the next call. Pass multiple values using collection types, such as tuples.
- STYPE <cql_type>
-
CQL type of the parameter returned by the state function.
- FINALFUNC <udf_name>
-
User-defined function executed on the final values in the state parameter.
- INITCOND [<init_value>]
-
Define the initial condition, values, of the first parameter in the SFUNC. Set to null when no value defined.
Optional parameters
- OR REPLACE
-
Overwrites existing aggregate (with the same name). When OR REPLACE is not specified the operation fails if an aggregate with the same name already exists.
- IF NOT EXISTS
-
Creates an aggregate if it does not already exist, and displays no error if it does exist.
IF NOT EXISTS
andOR REPLACE
are not supported in the same statement. - DETERMINISTIC
-
Always returns the same output for a certain input. Requires an initial condition and returns a single value.
Default: false
(non-deterministic).
|
Examples
Create an aggregate that calculates average in the cycling keyspace and test:
This section explains how to create a function that has the same functionality as the native AVG function. |
-
Create a function with a state parameter as a tuple that counts the rows (by incrementing 1 for each record) in the first position and finds the total by adding the current row value to the existing subtotal the second position, and returns the updated state.
CREATE OR REPLACE FUNCTION cycling.average_state (state tuple<int, bigint>, val int) CALLED ON NULL INPUT RETURNS tuple<int, bigint> LANGUAGE java AS $$ if (val != null) { state.setInt(0, state.getInt(0) + 1); state.setLong(1, state.getLong(1) + val.intValue()); } return state; $$ ;
Use a simple test to verify that your function works properly.
CREATE TABLE IF NOT EXISTS cycling.test_average_state ( id int PRIMARY KEY, state frozen<tuple<int, bigint>>, val int );
The first value was incremented by one and the second value is the results of the initial state value and val.
state | cycling.avgstate(state, val) | val -------------+------------------------------+------ (6, 9949) | (7, 10000) | 51 (79, 10000) | (80, 19999) | 9999
-
Create a function that divides the total value for the selected column by the number of records.
CREATE OR REPLACE FUNCTION cycling.average_final (state tuple<int,bigint>) CALLED ON NULL INPUT RETURNS double LANGUAGE java AS $$ double r = 0; if (state.getInt(0) == 0) return null; r = state.getLong(1); r /= state.getInt(0); return Double.valueOf(r); $$ ;
-
Create the user-defined aggregate to calculate the average value in the column:
CREATE OR REPLACE AGGREGATE cycling.average (int) SFUNC average_state STYPE tuple<int,bigint> FINALFUNC average_final INITCOND (0, 0);
-
Test the function using a select statement.
SELECT * FROM cycling.team_average WHERE team_name = 'UnitedHealthCare Pro Cycling Womens Team' AND race_title = 'Amgen Tour of California Women''s Race presented by SRAM - Stage 1 - Lake Tahoe > Lake Tahoe';