Find distinct values
This Astra DB Serverless feature is currently in public preview. Development is ongoing, and the features and functionality are subject to change. Astra DB Serverless, and the use of such, is subject to the DataStax Preview Terms. The Data API tables commands are available through HTTP and the clients. If you use a client, tables commands are available only in client versions 2.0-preview or later. For more information, see Data API client upgrade guide. |
Finds the distinct values of a column for rows in a table.
For general information about working with tables and rows, see About tables.
This method finds all rows that match the filter, or all rows if no filter is applied. There can be performance, latency, and billing implications if there are many matching rows. |
Method signature
-
Python
-
TypeScript
-
Java
-
curl
The following method belongs to the astrapy.Table
class.
distinct(
key: str,
*,
filter: Dict[str, Any],
general_method_timeout_ms: int,
request_timeout_ms: int,
timeout_ms: int,
) -> list[Any]
The TypeScript client does not support this method. Instead, you can use Find rows, and then extract the desired values from the response.
The following methods belong to the com.datastax.astra.client.tables.Table
class.
<R> List<R> distinct(
String fieldName,
Filter filter,
Class<R> resultClass
)
<R> List<R> distinct(
String fieldName,
Filter filter,
Class<R> resultClass,
TableDistinctOptions options
)
This method has no literal equivalent in HTTP. Instead, you can use Find rows, and then extract the desired values from the response.
Result
-
Python
-
TypeScript
-
Java
-
curl
Returns a list of the distinct values of the specified key.
Example response:
['home_appliance', None, 'sports_equipment', {'cat_id': 54, 'cat_name': 'gardening_gear'}]
The TypeScript client does not support this method. Instead, you can use Find rows, and then extract the desired values from the response.
Returns a list of the distinct values of the specified key.
This method has no literal equivalent in HTTP. Instead, you can use Find rows, and then extract the desired values from the response.
Parameters
-
Python
-
TypeScript
-
Java
-
curl
Name | Type | Summary |
---|---|---|
|
|
The name of the field whose value is inspected across rows.
Keys are typically just column names, although they can use dot notation to select particular entries in |
|
|
A dictionary expressing which condition the inspected rows must satisfy. You can use filter operators to compare columns with literal values. For example:
|
|
|
A timeout, in milliseconds, to impose on the underlying API request.
If not provided, the |
|
|
A timeout, in milliseconds, for each API request performed while scrolling through matches (see |
The TypeScript client does not support this method. Instead, you can use Find rows, and then extract the desired values from the response.
Name | Type | Summary |
---|---|---|
|
|
The name of the field on which perform the distinct |
|
A filter expressing which condition the inspected rows must satisfy.
You can use filter operators to compare columns with literal values.
Filters can be instantiated with its constructor and specialized with method |
|
|
|
Represents the expected type of the column returned.
For example, if the target column is |
|
A wrapper for the different options and specialization of this search. |
This method has no literal equivalent in HTTP. Instead, you can use Find rows, and then extract the desired values from the response.
Examples
The following examples demonstrate how to find distinct values of a column for rows in a table.
-
Python
-
TypeScript
-
Java
-
curl
Use a specified filter condition to list distinct values for a column:
my_table.distinct("winner", filter={"match_id": "challenge6"})
Example:
Full example script
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
from astrapy.constants import SortMode
from astrapy.info import (
CreateTableDefinition,
ColumnType,
)
my_table = database.create_table(
"games",
definition=(
CreateTableDefinition.builder()
.add_column("match_id", ColumnType.TEXT)
.add_column("round", ColumnType.TINYINT)
.add_vector_column("m_vector", dimension=3)
.add_column("score", ColumnType.INT)
.add_column("when", ColumnType.TIMESTAMP)
.add_column("winner", ColumnType.TEXT)
.add_set_column("fighters", ColumnType.UUID)
.add_partition_by(["match_id"])
.add_partition_sort({"round": SortMode.ASCENDING})
.build()
),
)
from astrapy.data_types import (
DataAPISet,
DataAPITimestamp,
DataAPIVector,
)
from astrapy.ids import UUID
insert_result = my_table.insert_many(
[
{
"match_id": "fight4",
"round": 1,
"winner": "Victor",
"score": 18,
"when": DataAPITimestamp.from_string(
"2024-11-28T11:30:00Z",
),
"fighters": DataAPISet([
UUID("0193539a-2770-8c09-a32a-111111111111"),
UUID('019353e3-00b4-83f9-a127-222222222222'),
]),
"m_vector": DataAPIVector([0.4, -0.6, 0.2]),
},
{
"match_id": "challenge6",
"round": 1,
"winner": "Donna",
"m_vector": [0.9, -0.1, -0.3],
},
{"match_id": "challenge6", "round": 2, "winner": "Erick"},
{"match_id": "challenge6", "round": 3, "winner": "Fiona"},
],
)
# distinct within a partition:
my_table.distinct("winner", filter={"match_id": "challenge6"})
# ['Donna', 'Erick', 'Fiona']
# distinct values across the whole table:
# (not recommended performance-wise)
my_table.distinct("winner")
# The Data API returned a warning: {'errorCode': 'ZERO_FILTER_OPERATIONS', ...
# ['Victor', 'Adam Zuul', 'Betta Vigo', 'Caio Gozer', 'Donna', 'Erick', ...
# Over a column containing null values
# (also with composite filter):
my_table.distinct(
"score",
filter={"match_id": {"$in": ["fight4", "tournamentA"]}},
)
# [18, None]
# distinct over a set column (automatically "unrolled"):
my_table.distinct(
"fighters",
filter={"match_id": {"$in": ["fight4", "tournamentA"]}},
)
# [UUID('0193539a-2770-8c09-a32a-111111111111'), UUID('019353e3-00b4-...
# distinct within a partition:
my_table.distinct("winner", filter={"match_id": "challenge6"})
# ['Donna', 'Erick', 'Fiona']
# distinct values across the whole table:
# (not recommended performance-wise)
my_table.distinct("winner")
# The Data API returned a warning: {'errorCode': 'ZERO_FILTER_OPERATIONS', ...
# ['Victor', 'Adam Zuul', 'Betta Vigo', 'Caio Gozer', 'Donna', 'Erick', ...
# Over a column containing null values
# (also with composite filter):
my_table.distinct(
"score",
filter={"match_id": {"$in": ["fight4", "tournamentA"]}},
)
# [18, None]
# distinct over a set column (automatically "unrolled"):
my_table.distinct(
"fighters",
filter={"match_id": {"$in": ["fight4", "tournamentA"]}},
)
# [UUID('0193539a-2770-8c09-a32a-111111111111'), UUID('019353e3-00b4-...
The TypeScript client does not support this method. Instead, you can use Find rows, and then extract the desired values from the response.
Use a filter condition to find rows and then list distinct values in a given column:
// Find with no options
Filter filter = Filters.eq("match_id", "challenge6");
// Show you the distinct match id in the table
List<String> winners = table
.distinct("winner", filter, String.class);
// Adding an Options
TableDistinctOptions options = new TableDistinctOptions()
.timeout(1000);
List<String> winners2 = table
.distinct("winner", filter, String.class, options);
Example:
package com.datastax.astra.client.tables;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.DataAPIClients;
import com.datastax.astra.client.core.query.Filter;
import com.datastax.astra.client.core.query.Filters;
import com.datastax.astra.client.core.query.Sort;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.tables.commands.options.TableDistinctOptions;
import com.datastax.astra.client.tables.commands.options.TableFindOptions;
import com.datastax.astra.client.tables.cursor.TableCursor;
import com.datastax.astra.client.tables.definition.rows.Row;
import java.util.List;
import static com.datastax.astra.client.core.query.Filters.eq;
import static com.datastax.astra.client.core.query.Projection.include;
public class Distinct {
public static void main(String[] args) {
Database db = new DataAPIClient("token").getDatabase("endpoint");
Table<Row> table = db.getTable("games");
// Show you all match id in the table
List<Row> matches = table.find(null, new TableFindOptions()
.projection(include("match_id"))).toList();
matches.forEach(System.out::println);
Filter filter = Filters.eq("match_id", "challenge6");
TableDistinctOptions options = new TableDistinctOptions()
.timeout(1000);
// Show you the distinct match id in the table
table.distinct("match_id", filter, String.class)
.forEach(System.out::println);
}
}
This method has no literal equivalent in HTTP. Instead, you can use Find rows, and then extract the desired values from the response.
Client reference
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the client reference.
The TypeScript client does not support this method. Instead, you can use Find rows, and then extract the desired values from the response.
For more information, see the client reference.
Client reference documentation is not applicable for HTTP.