Find distinct values
A document represents a single row or record of data in an Astra DB Serverless database.
You use the Collection class to work with documents through the Data API clients.
For instructions to get a Collection object, see Work with collections.
For general information about working with documents, including common operations and operators, see Work with documents.
For more information about the Data API and clients, see Get started with the Data API.
Find distinct values across documents
Get a list of the distinct values of a certain key in a collection.
Sort and filter clauses can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort or filter queries.
- Python
- TypeScript
- Java
- curl
For more information, see the Client reference.
collection.distinct("category")
Get the distinct values in a subset of documents, with a key defined by a dot-syntax path.
collection.distinct(
    "food.allergies",
    filter={"registered_for_dinner": True},
)
Parameters:
Name | Type | Summary |
---|---|---|
| | The name of the field whose value is inspected across documents. Keys can use dot-notation to descend to deeper document levels. Example of acceptable |
| | A predicate expressed as a dictionary according to the Data API filter syntax. Examples are |
| | A timeout, in milliseconds, for the operation. This method uses the collection-level timeout by default. |
Returns:
List[Any] - A list of the distinct values encountered. Documents that lack the requested key are ignored.
Example response
['home_appliance', None, 'sports_equipment', {'cat_id': 54, 'cat_name': 'gardening_gear'}]
For details on the behavior of "distinct" in conjunction with real-time changes in the collection contents, see the discussion in the Sort examples values section.
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
# Insert some documents
collection.insert_many(
    [
        {"name": "Marco", "food": ["apple", "orange"], "city": "Helsinki"},
        {"name": "Emma", "food": {"likes_fruit": True, "allergies": []}},
    ]
)
collection.distinct("name")
# returns: ['Marco', 'Emma']
collection.distinct("city")
# returns: ['Helsinki']
collection.distinct("food")
# returns: ['apple', 'orange', {'likes_fruit': True, 'allergies': []}]
collection.distinct("food.1")
# returns: ['orange']
collection.distinct("food.allergies")
# returns: []
collection.distinct("food.likes_fruit")
# returns: [True]
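The results in the example above can be reproduced without a database connection. The following is a minimal sketch of the apparent semantics, not the client's implementation. The helpers `values_at` and `distinct_local` are hypothetical names. The sketch assumes that list values are unrolled into their items, that a numeric path segment selects a list item by index, and that documents lacking the key contribute nothing, as the example suggests.

```python
import json
from typing import Any, Dict, Iterator, List


def values_at(node: Any, path: List[str]) -> Iterator[Any]:
    """Yield every value reachable from `node` by following `path`."""
    if not path:
        # End of the path: a list contributes its items, anything else itself.
        if isinstance(node, list):
            yield from node
        else:
            yield node
        return
    head, rest = path[0], path[1:]
    if isinstance(node, dict):
        if head in node:
            yield from values_at(node[head], rest)
    elif isinstance(node, list):
        if head.isdigit():
            # A numeric segment selects a single list item by index.
            index = int(head)
            if index < len(node):
                yield from values_at(node[index], rest)
        else:
            # A non-numeric segment is applied to each item of the list.
            for item in node:
                yield from values_at(item, path)
    # Scalars have no sub-keys, so nothing is yielded for them.


def distinct_local(documents: List[Dict[str, Any]], key: str) -> List[Any]:
    """Return the distinct values found at `key`, in first-seen order."""
    seen = set()
    result = []
    for doc in documents:
        for value in values_at(doc, key.split(".")):
            # Values can be unhashable (dicts, lists), so compare by a
            # canonical JSON string. This is a simplification (assumption).
            fingerprint = json.dumps(value, sort_keys=True, default=str)
            if fingerprint not in seen:
                seen.add(fingerprint)
                result.append(value)
    return result


documents = [
    {"name": "Marco", "food": ["apple", "orange"], "city": "Helsinki"},
    {"name": "Emma", "food": {"likes_fruit": True, "allergies": []}},
]
print(distinct_local(documents, "food"))
# ['apple', 'orange', {'likes_fruit': True, 'allergies': []}]
print(distinct_local(documents, "food.1"))
# ['orange']
print(distinct_local(documents, "food.allergies"))
# []
```

Note that a key explicitly set to `None` still yields `None` as a distinct value, whereas a document without the key is skipped. The real method runs against the live collection and may differ in ordering and edge cases.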
For more information, see the Client reference.
const unique = await collection.distinct('category');
Get the distinct values in a subset of documents, with a key defined by a dot-syntax path.
const unique = await collection.distinct(
  'food.allergies',
  { registeredForDinner: true },
);
Parameters:
Name | Type | Summary |
---|---|---|
| | The name of the field whose value is inspected across documents. Keys can use dot-notation to descend to deeper document levels. Example of acceptable key values: |
| | A filter to select the documents to use. If not provided, all documents will be used. |
Returns:
Promise<Flatten<(SomeDoc & ToDotNotation<FoundDoc<Schema>>)[Key]>[]> - A promise that resolves to the distinct values.
The inferred return type is mostly accurate, but with complex keys you might need to manually cast the result to the expected type.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { keyspace: 'KEYSPACE' });
const collection = db.collection('COLLECTION');
(async function () {
  // Insert some documents
  await collection.insertOne({ name: 'Marco', food: ['apple', 'orange'], city: 'Helsinki' });
  await collection.insertOne({ name: 'Emma', food: { likes_fruit: true, allergies: [] } });
  // ['Marco', 'Emma']
  await collection.distinct('name');
  // ['Helsinki']
  await collection.distinct('city');
  // ['apple', 'orange', { likes_fruit: true, allergies: [] }]
  await collection.distinct('food');
  // ['orange']
  await collection.distinct('food.1');
  // []
  await collection.distinct('food.allergies');
  // [true]
  await collection.distinct('food.likes_fruit');
})();
Gets the distinct values of the specified field name.
// Synchronous
DistinctIterable<T,F> distinct(String fieldName, Filter filter, Class<F> resultClass);
DistinctIterable<T,F> distinct(String fieldName, Class<F> resultClass);
// Asynchronous
CompletableFuture<DistinctIterable<T,F>> distinctAsync(String fieldName, Filter filter, Class<F> resultClass);
CompletableFuture<DistinctIterable<T,F>> distinctAsync(String fieldName, Class<F> resultClass);
Parameters:
Name | Type | Summary |
---|---|---|
fieldName | String | The name of the field whose distinct values are returned. |
filter | Filter | The criteria used to filter the documents. The filter is a JSON object that can contain any valid Data API filter expression. |
resultClass | Class<F> | The class of the values in the field. |
Returns:
DistinctIterable<T,F> - An iterable over the distinct values of the specified field.
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.DistinctIterable;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import static com.datastax.astra.client.model.Filters.lt;
public class Distinct {
    public static void main(String[] args) {
        // Given an existing collection
        Collection<Document> collection = new DataAPIClient("TOKEN")
                .getDatabase("API_ENDPOINT")
                .getCollection("COLLECTION_NAME");
        // Build a filter
        Filter filter = Filters.and(
                Filters.gt("field2", 10),
                lt("field3", 20),
                Filters.eq("field4", "value"));
        // Get the distinct values of a field across all documents
        DistinctIterable<Document, String> result = collection
                .distinct("field", String.class);
        // Get the distinct values of a field across documents that match the filter
        DistinctIterable<Document, String> result2 = collection
                .distinct("field", filter, String.class);
        // Iterate over the result
        for (String fieldValue : result) {
            System.out.println(fieldValue);
        }
    }
}
This operation has no literal equivalent in HTTP.
Instead, you can find documents using filter options, and then use jq or another utility to extract _id or other desired values from the response.
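As an alternative to jq, the extraction step can be sketched in Python. This is an illustration only: `distinct_from_find_response` is a hypothetical helper, and the response shape (documents under `data.documents`) is an assumption about the JSON returned by a find request. The sketch handles a single page of results and top-level fields only, so it does not unroll lists or follow dot-notation paths.

```python
import json
from typing import Any, Dict, List


def distinct_from_find_response(response: Dict[str, Any], field: str) -> List[Any]:
    """Collect the distinct values of a top-level field from a parsed find response."""
    # Assumed response shape: {"data": {"documents": [...]}}
    documents = response.get("data", {}).get("documents", [])
    seen = set()
    values = []
    for doc in documents:
        if field not in doc:
            # Documents that lack the field are skipped.
            continue
        # Compare by canonical JSON so unhashable values (dicts, lists) work.
        fingerprint = json.dumps(doc[field], sort_keys=True, default=str)
        if fingerprint not in seen:
            seen.add(fingerprint)
            values.append(doc[field])
    return values


# A hand-written sample standing in for the parsed body of a find response.
sample_response = {
    "data": {
        "documents": [
            {"_id": "1", "category": "home_appliance"},
            {"_id": "2", "category": "sports_equipment"},
            {"_id": "3", "category": "home_appliance"},
            {"_id": "4"},
        ],
        "nextPageState": None,
    }
}
print(distinct_from_find_response(sample_response, "category"))
# ['home_appliance', 'sports_equipment']
```

If the find request returns more than one page, the values from each page would need to be merged before deduplication.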