Find distinct values

Finds the distinct values of a key for documents in a collection.

This method finds all documents that match the filter, or all documents if no filter is applied. There can be performance, latency, and billing implications if there are many matching documents.

Ready to write code? See the examples for this method to get started. If you are new to the Data API, check out the quickstart.

Result

  • Python

  • TypeScript

  • Java

  • curl

Returns a list of the distinct values of the specified key. The method excludes documents that do not include the requested key.

Example response:

['home_appliance', None, 'sports_equipment', {'cat_id': 54, 'cat_name': 'gardening_gear'}]
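As a rough illustration of the result semantics described above, the following sketch reproduces them locally in plain Python. It does not call the Data API: the `distinct_values` helper and the sample documents are hypothetical, and the sketch only covers scalar and object values of a top-level key.

```python
import json
from typing import Any


def distinct_values(documents: list[dict[str, Any]], key: str) -> list[Any]:
    """Collect the distinct values of a top-level key (illustrative only)."""
    seen: set[str] = set()
    result: list[Any] = []
    for doc in documents:
        if key not in doc:
            continue  # documents without the key are excluded
        value = doc[key]
        # Dicts are unhashable, so compare values by a canonical JSON form.
        fingerprint = json.dumps(value, sort_keys=True)
        if fingerprint not in seen:
            seen.add(fingerprint)
            result.append(value)
    return result


# Hypothetical sample data, chosen to reproduce the example response above.
sample_documents = [
    {"_id": 1, "category": "home_appliance"},
    {"_id": 2, "category": None},
    {"_id": 3, "category": "sports_equipment"},
    {"_id": 4, "category": "home_appliance"},
    {"_id": 5},  # no "category" key: skipped
    {"_id": 6, "category": {"cat_id": 54, "cat_name": "gardening_gear"}},
    {"_id": 7, "category": {"cat_name": "gardening_gear", "cat_id": 54}},
]

print(distinct_values(sample_documents, "category"))
```

Note that a document whose value is `None` still contributes `None` to the result, whereas a document that lacks the key contributes nothing.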

Returns a promise that resolves to a list of the distinct values of the specified key. Documents that do not include the requested key are ignored.

The TypeScript client will attempt to infer the return type. However, you may need to explicitly cast the return type to match the expected type.

Example resolved response:

['home_appliance', null, 'sports_equipment', { cat_id: 54, cat_name: 'gardening_gear' }]

Returns the distinct values of the specified key as a Set. Documents that do not include the requested key are ignored.

This method has no literal equivalent in HTTP. Instead, you can use Find documents, and then use jq or another utility to extract the desired values from the response.

Parameters

  • Python

  • TypeScript

  • Java

  • curl

Use the distinct method, which belongs to the astrapy.Collection class.

Method signature
distinct(
  key: str | Iterable[str | int],
  *,
  filter: Dict[str, Any],
  general_method_timeout_ms: int,
  request_timeout_ms: int,
  timeout_ms: int,
) -> list[Any]
Name Type Summary

key

str | Iterable[str | int]

The field or subfield for which to find values.

To find distinct values for a nested field, use a list or use dot notation in a string. To find distinct values for a specific index in an array, use the index in a list or in a string with dot notation. If a list is encountered and no index is specified, the method visits all items in the list.

To escape literal . in field names, use &.

Examples:

  • "f": matches both {"f": <value>} and {"f": [<value>, …​]}.

  • "f.g": matches {"f": {"g": <value>}} but not {"f.g": <value>}.

  • "f&.g": matches {"f.g": <value>} but not {"f": {"g": <value>}}.

  • "f.0": matches both {"f": {"0": <value>}} and {"f": [<value>, …​]}.

  • ["f", "g"]: matches {"f": {"g": <value>}} but not {"f.g": <value>}.

  • ["f.g"]: matches {"f.g": <value>} but not {"f": {"g": <value>}}.

  • ["f", "0"]: matches {"f": {"0": <value>}} but not {"f": [<value>, …​]}.

  • ["f", 0]: matches {"f": [<value>, …​]} but not {"f": {"0": <value>}}.
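The matching rules in the list above can be modeled locally. The following is a minimal sketch, not the astrapy implementation: `parse_key`, `extract`, and `values_for` are hypothetical helpers, and only the `&.` escape described above is modeled.

```python
from typing import Any, Iterator, Optional, Sequence, Union

# A path segment is (dict_key, list_index); either part may be None.
Segment = tuple[Optional[str], Optional[int]]


def parse_key(key: Union[str, Sequence[Union[str, int]]]) -> list[Segment]:
    """Turn a dotted string or a list of segments into path segments."""
    if isinstance(key, str):
        parts: list[str] = []
        current = ""
        i = 0
        while i < len(key):
            if key.startswith("&.", i):  # escaped literal dot
                current += "."
                i += 2
            elif key[i] == ".":  # unescaped dot separates segments
                parts.append(current)
                current = ""
                i += 1
            else:
                current += key[i]
                i += 1
        parts.append(current)
        # In a dotted string, a numeric segment may be a dict key or a list index.
        return [(p, int(p) if p.isascii() and p.isdigit() else None) for p in parts]
    # In a list, a str is only a dict key and an int is only a list index.
    return [(None, s) if isinstance(s, int) else (s, None) for s in key]


def extract(value: Any, segments: list[Segment]) -> Iterator[Any]:
    """Yield every value reachable from `value` by following `segments`."""
    if not segments:
        if isinstance(value, list):
            yield from value  # a list at the end of the path: visit all items
        else:
            yield value
        return
    (name, index), rest = segments[0], segments[1:]
    if isinstance(value, dict):
        if name is not None and name in value:
            yield from extract(value[name], rest)
    elif isinstance(value, list):
        if index is not None:
            if 0 <= index < len(value):
                yield from extract(value[index], rest)
        else:
            for item in value:  # no index given: visit all items
                yield from extract(item, segments)


def values_for(document: dict[str, Any], key: Union[str, Sequence[Union[str, int]]]) -> list[Any]:
    return list(extract(document, parse_key(key)))


print(values_for({"f": [1, 2]}, "f"))         # [1, 2]
print(values_for({"f.g": 1}, "f.g"))          # []
print(values_for({"f.g": 1}, "f&.g"))         # [1]
print(values_for({"f": [1, 2]}, "f.0"))       # [1]
print(values_for({"f": [1, 2]}, ["f", "0"]))  # []
print(values_for({"f": [1, 2]}, ["f", 0]))    # [1]
```

The sketch keeps a dict key and a list index as separate parts of each segment. This is what lets `"f.0"` match both shapes while `["f", "0"]` and `["f", 0]` each match only one.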

filter

Dict[str, Any]

An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria.

For a list of available filter operators and more examples, see Filter operators for collections.

Filters can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in a filter.

general_method_timeout_ms

int

Optional. The maximum time, in milliseconds, that the whole operation, which may involve multiple HTTP requests, can take.

Default: The default value for the collection. This default is 30 seconds unless you specified a different default when you initialized the Collection or DataAPIClient object. For more information, see Timeout options.

This parameter is aliased as timeout_ms for convenience.

request_timeout_ms

int

Optional. The maximum time, in milliseconds, that the client should wait for each underlying HTTP request.

Default: The default value for the collection. This default is 10 seconds unless you specified a different default when you initialized the Collection object. For more information, see Timeout options.
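To make the relationship between the two timeouts concrete, here is a simplified model of how a per-request limit and an overall budget could interact when an operation issues several HTTP requests. This is an assumption for illustration, not the astrapy implementation. `effective_request_timeout_ms` is a hypothetical helper, and its defaults mirror the 10-second and 30-second defaults stated above.

```python
def effective_request_timeout_ms(
    elapsed_ms: int,
    request_timeout_ms: int = 10_000,
    general_method_timeout_ms: int = 30_000,
) -> int:
    """Return how long the next HTTP request may wait (illustrative model).

    Each request is capped by the per-request limit and by whatever is
    left of the overall budget for the whole operation.
    """
    remaining_ms = general_method_timeout_ms - elapsed_ms
    if remaining_ms <= 0:
        raise TimeoutError("overall time budget for the operation is exhausted")
    return min(request_timeout_ms, remaining_ms)


print(effective_request_timeout_ms(elapsed_ms=0))       # 10000
print(effective_request_timeout_ms(elapsed_ms=25_000))  # 5000
```

Under this model, an operation that pages through many documents can fail on the overall budget even if every individual request stays within its own limit.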

Use the distinct method, which belongs to the Collection class.

Method signature
async distinct(
  key: Key,
  filter: CollectionFilter<Schema>,
  options: {
    timeout?: number | TimeoutDescriptor,
  },
): Flatten<(SomeDoc & ToDotNotation<Schema>)[Key]>[]
Name Type Summary

key

string

The name of the field for which to find values.

To find distinct values for a nested field, use dot notation. For example, field.subfield.subsubfield.

To use dot notation for a list, specify a numeric index. For example, field.3. If a list is encountered and no numeric index is specified, the method visits all items in the list.

filter

CollectionFilter<Schema>

An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria.

For a list of available filter operators and more examples, see Filter operators for collections.

Filters can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in a filter.

options

object

Optional. The options for this operation. See the options table for more details.

Properties of options
Name Type Summary

timeout

number | TimeoutDescriptor

Optional.

The timeout(s) to apply to this method. You can specify requestTimeoutMs and generalMethodTimeoutMs. Since this method issues a single HTTP request, these timeouts are equivalent.

Details about the timeout parameter

The TimeoutDescriptor object can contain these properties:

  • requestTimeoutMs (number): The maximum time, in milliseconds, that the client should wait for each underlying HTTP request. Default: The default value for the collection. This default is 10 seconds unless you specified a different default when you initialized the Collection or DataAPIClient object.

  • generalMethodTimeoutMs (number): The maximum time, in milliseconds, that the whole operation can take. Since this method issues a single HTTP request, generalMethodTimeoutMs and requestTimeoutMs are equivalent. If you specify both, the minimum of the two will be used. Default: The default value for the collection. This default is 30 seconds unless you specified a different default when you initialized the Collection or DataAPIClient object.

If you specify a number instead of a TimeoutDescriptor object, that number will be applied to both requestTimeoutMs and generalMethodTimeoutMs.

Use the distinct method, which belongs to the com.datastax.astra.client.Collection class.

Method signature
<R> Set<R> distinct(
  String key,
  Class<R> resultClass
)
<R> Set<R> distinct(
  String key,
  Filter filter,
  Class<R> resultClass
)
Name Type Summary

key

String

The name of the field for which to find values.

The Java client does not support dot notation to find distinct values for nested fields.

filter

Filter

An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria.

For a list of available filter operators and more examples, see Filter operators for collections.

Filters can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in a filter.

resultClass

Class

The type of the values that you expect this method to return.

This method has no literal equivalent in HTTP. Instead, you can use Find documents, and then use jq or another utility to extract the desired values from the response.

Examples

The following examples demonstrate how to find distinct values of a key for documents in a collection.

Find distinct values of a top-level field

  • Python

  • TypeScript

  • Java

  • curl

from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
  "ASTRA_DB_API_ENDPOINT",
  token="ASTRA_DB_APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find distinct values
result = collection.distinct("publication_year")

print(result)

import { DataAPIClient } from '@datastax/astra-db-ts';

// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');

(async function () {
  // Find distinct values
  const result = await collection.distinct("publication_year");

  console.log(result)
})();

package com.examples;

import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.definition.documents.Document;
import java.util.Set;

public class FindDistinct {

    public static void main(String[] args) {
        // Get an existing collection
        Collection<Document> collection = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
            .getDatabase("ASTRA_DB_API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

        // Find distinct values
        Set<String> result = collection
            .distinct("author", String.class);

        for (String fieldValue : result) {
            System.out.println(fieldValue);
        }
    }
}

This method has no literal equivalent in HTTP. Instead, you can use Find documents, and then use jq or another utility to extract the desired values from the response.
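If you prefer a script over jq, the extraction step can be done in a few lines of Python. The sketch below works on already-fetched response bodies and makes no HTTP calls. It assumes each Find documents response carries its results under `data.documents` and a `nextPageState` value for pagination. The `distinct_from_pages` helper and the sample pages are hypothetical.

```python
import json
from typing import Any, Iterable


def distinct_from_pages(pages: Iterable[str], key: str) -> list[Any]:
    """Collect distinct values of a top-level key across response pages.

    Only handles hashable values (strings, numbers, booleans, null).
    """
    values: dict[Any, None] = {}  # a dict preserves first-seen order
    for raw_page in pages:
        body = json.loads(raw_page)
        for document in body["data"]["documents"]:
            if key in document:  # skip documents without the key
                values.setdefault(document[key], None)
    return list(values)


# Hypothetical response bodies, one per page of results.
sample_pages = [
    '{"data": {"documents": [{"_id": "a", "publication_year": 1999},'
    ' {"_id": "b", "publication_year": 2012}],'
    ' "nextPageState": "hypothetical-page-state"}}',
    '{"data": {"documents": [{"_id": "c", "publication_year": 1999},'
    ' {"_id": "d"}], "nextPageState": null}}',
]

print(distinct_from_pages(sample_pages, "publication_year"))  # [1999, 2012]
```

Because results can span several pages, collect every page before deduplicating. Otherwise values that appear only on later pages are missed.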

Find distinct values of a nested field

To find distinct values for a nested field, use dot notation. For example, field.subfield.subsubfield.

To use dot notation for a list, specify a numeric index. For example, field.3. If a list is encountered and no numeric index is specified, the method visits all items in the list.

  • Python

  • TypeScript

  • Java

  • curl

from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
  "ASTRA_DB_API_ENDPOINT",
  token="ASTRA_DB_APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find distinct values
result = collection.distinct(["metadata", "language"])

# Equivalently:
# result = collection.distinct("metadata.language")

print(result)

import { DataAPIClient } from '@datastax/astra-db-ts';

// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');

(async function () {
  // Find distinct values
  const result = await collection.distinct("metadata.language");

  console.log(result)
})();

The Java client does not support dot notation to find distinct values for nested fields.

This method has no literal equivalent in HTTP. Instead, you can use Find documents, and then use jq or another utility to extract the desired values from the response.

Find distinct values for a subset of documents

You can use a filter to find distinct values across documents that match the filter.

For a list of available filter operators and more examples, see Filter operators for collections.

Filters can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in a filter.

  • Python

  • TypeScript

  • Java

  • curl

from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
  "ASTRA_DB_API_ENDPOINT",
  token="ASTRA_DB_APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find distinct values
result = collection.distinct(
  "publication_year",
  filter={
    "$and": [
      {"is_checked_out": False},
      {"number_of_pages": {"$lt": 300}},
    ]
  }
)

print(result)

import { DataAPIClient } from '@datastax/astra-db-ts';

// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');

(async function () {
  // Find distinct values
  const result = await collection.distinct(
    "publication_year",
    {
      "$and": [
        {"is_checked_out": false},
        {"number_of_pages": {"$lt": 300}},
      ]
    }
  );

  console.log(result)
})();

package com.examples;

import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Filter;
import com.datastax.astra.client.core.query.Filters;
import java.util.Set;

public class FindDistinct {

    public static void main(String[] args) {
        // Get an existing collection
        Collection<Document> collection = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
            .getDatabase("ASTRA_DB_API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

        // Find distinct values
        Filter filter = Filters.and(
          Filters.eq("is_checked_out", false),
          Filters.lt("number_of_pages", 300));
        Set<String> result = collection
            .distinct("author", filter, String.class);

        for (String fieldValue : result) {
            System.out.println(fieldValue);
        }
    }
}

This method has no literal equivalent in HTTP. Instead, you can use Find documents, and then use jq or another utility to extract the desired values from the response.
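For intuition about which documents the filter in the examples above selects, here is a small local model of it. The Data API evaluates filters on the server and supports many more operators. The `matches` helper below is hypothetical and handles only equality, `$lt`, and `$and`, and the sample documents are invented.

```python
from typing import Any


def matches(document: dict[str, Any], filter_: dict[str, Any]) -> bool:
    """Evaluate a tiny subset of the filter syntax (illustrative only)."""
    for field, condition in filter_.items():
        if field == "$and":
            # Every sub-filter must match.
            if not all(matches(document, sub) for sub in condition):
                return False
        elif isinstance(condition, dict) and "$lt" in condition:
            value = document.get(field)
            if value is None or not value < condition["$lt"]:
                return False
        else:
            # Plain equality on a field that must be present.
            if field not in document or document[field] != condition:
                return False
    return True


example_filter = {
    "$and": [
        {"is_checked_out": False},
        {"number_of_pages": {"$lt": 300}},
    ]
}

# Hypothetical sample documents.
books = [
    {"publication_year": 1999, "is_checked_out": False, "number_of_pages": 250},
    {"publication_year": 2012, "is_checked_out": True, "number_of_pages": 120},
    {"publication_year": 2020, "is_checked_out": False, "number_of_pages": 480},
    {"publication_year": 2005, "is_checked_out": False, "number_of_pages": 90},
]

years = sorted({b["publication_year"] for b in books if matches(b, example_filter)})
print(years)  # [1999, 2005]
```

Only documents that pass the filter contribute values, so 2012 (checked out) and 2020 (too many pages) do not appear in the result.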

Client reference

  • Python

  • TypeScript

  • Java

  • curl

For more information, see the client reference.

Client reference documentation is not applicable for HTTP.

© 2025 DataStax