Collections reference
Collections store documents in namespaces in Serverless (Vector) databases.
With the Data API, use the Database class to manage collections and the Collection class to work with the data in collections.
You can create up to five collections in each Serverless (Vector) database.
Prerequisites
-
If you haven’t done so already, create a Serverless (Vector) database.
-
If you use a Data API client, instantiate a DataAPIClient object and connect to your database.
Create a collection
Create a new collection in an Astra DB Serverless database.
-
Python
-
TypeScript
-
Java
-
cURL
For more information, see the API reference.
collection = database.create_collection("collection")
Create a new collection to store vector data.
from astrapy.constants import VectorMetric
collection = database.create_collection(
"vector_collection",
dimension=5,
metric=VectorMetric.COSINE,
)
Create a new collection that generates vector embeddings automatically.
To automatically generate vector embeddings, you must enable the corresponding embedding provider integration, add the embedding provider API key in the Astra KMS, and make sure your database can access the embedding provider service.
from astrapy.info import CollectionVectorServiceOptions
from astrapy.constants import VectorMetric
collection = database.create_collection(
"vector_auto_collection",
metric=VectorMetric.DOT_PRODUCT,
service=CollectionVectorServiceOptions(
provider="openai",
model_name="text-embedding-3-small",
authentication={
"providerKey": "API_KEY_NAME",
},
),
)
Create a new collection with default document IDs of type ObjectID.
from astrapy.constants import DefaultIdType
collection = database.create_collection(
"collection_defaulting_to_objectids",
default_id_type=DefaultIdType.OBJECTID,
)
Create a new collection with only some fields indexed.
collection = database.create_collection(
"partial_indexing_collection",
indexing={"allow": ["city", "country"]},
)
Parameters:
Name | Type | Summary |
---|---|---|
name |
|
The name of the collection. |
namespace |
|
The namespace where the collection is to be created. If not specified, the database’s working namespace is used. |
dimension |
|
For vector collections, the dimension of the vectors; that is, the number of their components. If you’re not sure what dimension to set, use whatever dimension vector your embeddings model produces. |
metric |
|
The similarity metric used for vector searches. Allowed values are |
service |
|
The service definition for vector embeddings. Required for vector collections that generate embeddings automatically. This is an instance of
|
indexing |
|
Optional specification of the indexing options for the collection, in the form of a dictionary such as |
default_id_type |
|
This sets what type of IDs the API server will generate when inserting documents that do not specify their |
additional_options |
|
Any further set of key-value pairs that will be added to the "options" part of the payload when sending the Data API command to create a collection. |
check_exists |
|
Whether to run an existence check for the collection name before attempting to create the collection: If |
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. |
embedding_api_key |
|
An alternative to This parameter is not stored on the database, and it is used by the This is useful for creating collections with an embedding service without specifying an
|
collection_max_time_ms |
|
A default timeout, in milliseconds, for the duration of each operation on the collection.
Individual timeouts can be provided to each collection method call and will take precedence,
with this value being an overall default. Note that for some methods involving multiple API calls
(such as |
Returns:
Collection
- The created collection object that you can use to work with documents in the collection.
Example response
Collection(name="collection", namespace="default_keyspace", database=Database(api_endpoint="https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra.datastax.com", token="AstraCS:aAbB...", namespace="default_keyspace"))
Example:
from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
# Create a non-vector collection
collection_simple = database.create_collection("collection")
# Create a vector collection
collection_vector = database.create_collection(
"vector_collection",
dimension=3,
metric=astrapy.constants.VectorMetric.COSINE,
)
# Create a collection with UUIDv6 as default IDs
from astrapy.constants import DefaultIdType, SortDocuments
collection_uuid6 = database.create_collection(
"uuid6_collection",
default_id_type=DefaultIdType.UUIDV6,
)
collection_uuid6.insert_one({"desc": "a document", "seq": 0})
collection_uuid6.insert_one({"_id": 123, "desc": "another", "seq": 1})
doc_ids = [
doc["_id"]
for doc in collection_uuid6.find({}, sort={"seq": SortDocuments.ASCENDING})
]
print(doc_ids)
# Will print: [UUID('1eef29eb-d587-6779-adef-45b95ef13497'), 123]
print(doc_ids[0].version)
# Will print: 6
For more information, see the API reference.
const collection = await db.createCollection('COLLECTION');
Create a new collection to store vector data.
const collection = await db.createCollection<Schema>('COLLECTION', {
vector: {
dimension: 5,
metric: 'cosine',
},
checkExists: false,
});
Create a new collection that generates vector embeddings automatically.
To automatically generate vector embeddings, you must enable the corresponding embedding provider integration, add the embedding provider API key in the Astra KMS, and make sure your database can access the embedding provider service.
const collection = await db.createCollection<Schema>('COLLECTION', {
vector: {
metric: 'dot_product',
service: {
provider: 'openai',
modelName: 'text-embedding-3-small',
authentication: {
providerKey: 'API_KEY_NAME',
},
},
},
checkExists: false,
});
A Collection is typed as Collection<Schema>, where Schema is the type of the documents in the collection.
Operations on the collection are strongly typed if a specific schema is provided. If no type is provided, they remain largely weakly typed, which may be preferred for dynamic data access and operations.
It’s up to the user to ensure that the provided type truly represents the documents in the collection.
Parameters:
Name | Type | Summary |
---|---|---|
collectionName |
|
The name of the collection to create. |
vector? |
The options for creating the collection.
|
Options (CreateCollectionOptions
):
Name | Type | Summary |
---|---|---|
The vector configuration for the collection, e.g. vector dimension & similarity metric. If not set, collection will not support vector search. If you’re not sure what dimension to set, use whatever dimension vector your embeddings model produces. |
||
The indexing configuration for the collection. |
||
The defaultId configuration for the collection, for when a document does not specify an |
||
|
Overrides the namespace where the collection is created. If not set, the database’s working namespace is used. |
|
|
Whether to run an existence check for the collection name before attempting to create the collection. If it is Else, if it’s |
|
|
An alternative to |
|
|
Maximum time in milliseconds the client should wait for the operation to complete. |
Returns:
Promise<Collection<Schema>>
- A promise that resolves to the created collection object.
Example:
import { DataAPIClient, VectorDoc } from '@datastax/astra-db-ts';
// Get a new Db instance
const db = new DataAPIClient('TOKEN').db('API_ENDPOINT');
// Define the schema for the collection
interface User extends VectorDoc {
name: string,
age?: number,
}
(async function () {
// Create a basic untyped non-vector collection
const users1 = await db.createCollection('users');
await users1.insertOne({ name: 'John' });
// Typed collection with custom options in a non-default namespace
const users2 = await db.createCollection<User>('users', {
namespace: 'NAMESPACE',
defaultId: {
type: 'objectId',
},
vector: {
dimension: 5,
metric: 'cosine',
},
});
await users2.insertOne({ name: 'John', $vector: [.12, .62, .87, .16, .72] });
})();
For more information, see the API reference.
// Given db Database object, create a new collection
// Create simple collection with given name.
Collection<Document> simple1 = db
.createCollection(String collectionName);
Collection<MyBean> simple2 = db
.createCollection(String collectionName, Class<MyBean> clazz);
// Create collections with vector options
Collection<Document> vector1 = createCollection(
String collectionName,
int dimension,
SimilarityMetric metric);
Collection<MyBean> vector2 = createCollection(
String collectionName,
int dimension,
SimilarityMetric metric,
Class<MyBean> clazz);
// Full-Fledged CollectionOptions with a builder
Collection<Document> full1 = createCollection(
String collectionName,
CollectionOptions collectionOptions);
Collection<MyBean> full2 = createCollection(
String collectionName,
CollectionOptions collectionOptions,
Class<MyBean> clazz);
Parameters:
Name | Type | Summary |
---|---|---|
|
|
The name of the collection. |
|
|
The dimension for the vector in the collection. If you’re not sure what dimension to set, use whatever dimension vector your embeddings model produces. |
|
|
The similarity metric to use for vector search: |
|
|
Fine-grained settings with vector, embedding provider, model name, authentication, indexing, and |
|
|
Working with specialized beans for the collection and not the default |
Example:
package com.datastax.astra.client.database;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.Database;
import com.datastax.astra.client.model.CollectionIdTypes;
import com.datastax.astra.client.model.CollectionOptions;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.SimilarityMetric;
public class CreateCollection {
public static void main(String[] args) {
Database db = new Database("API_ENDPOINT", "TOKEN");
// Create a non-vector collection
Collection<Document> simple1 = db.createCollection("col");
// Create a vector collection from a dimension and a similarity metric
Collection<Document> vector1 = db
.createCollection("vector1", 14, SimilarityMetric.DOT_PRODUCT);
// Create a vector collection with CollectionOptions
Collection<Document> vector2 = db.createCollection("vector2", CollectionOptions
.builder()
.vectorDimension(1536)
.vectorSimilarity(SimilarityMetric.EUCLIDEAN)
.build());
// Create a collection with indexing (deny)
Collection<Document> indexing1 = db.createCollection("indexing1", CollectionOptions
.builder()
.indexingDeny("blob")
.build());
// Create a collection with indexing (allow) - cannot use allow and deny at the same time
Collection<Document> allow1 = db.createCollection("allow1", CollectionOptions
.builder()
.indexingAllow("metadata")
.build());
// Set the default ID type: objectid, uuid, uuidv6, or uuidv7
Collection<Document> defaultId = db.createCollection("defaultId", CollectionOptions
.builder()
.defaultIdType(CollectionIdTypes.OBJECT_ID)
.build());
}
}
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
"createCollection": {
"name": "vector_collection",
"options": {
"defaultId": {
"type": "objectId"
},
"vector": {
"dimension": 5,
"metric": "cosine"
},
"indexing": {
"allow": ["*"]
}
}
}
}' | jq
# '| jq' is optional.
Properties:
Name | Type | Summary |
---|---|---|
|
command |
The Data API command that specifies a new collection is to be created. It acts as a container for all the attributes and settings required to create the new collection. |
|
string |
The name of the new collection. A string value that uniquely identifies the collection within the database. |
|
Optional[string] |
Options for the collection, such as configuration for vector search. Required to create a vector-enabled collection. |
|
Optional[string] |
Controls how the Data API will allocate a new |
|
String |
Required if |
|
Optional[int] |
The dimension for vector search in the collection. If you’re not sure what dimension to set, use whatever dimension vector your embeddings model produces. |
|
Optional[string] |
The similarity metric to use for vector search: |
|
Optional[string] |
The service provider for vector embeddings. Required for vector collections that generate embeddings automatically. |
|
Optional[string] |
The model name for vector embeddings. |
|
Optional[string] |
Authenticate with the embeddings provider using an API key in the Astra DB KMS.
Alternatively, you can provide the embeddings provider key directly in an |
|
Optional[string] |
Determine which properties are indexed during subsequent update operations. If indexing is specified on |
|
[array] |
The |
|
[array] |
The |
Response
{
"status": {
"ok": 1
}
}
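The createCollection request from the cURL example can also be assembled with Python's standard library. The following is a minimal sketch, not an official client: `build_create_collection_request` is a hypothetical helper, and the endpoint `https://example.invalid` is a placeholder. The request is built but not sent.

```python
import json
import urllib.request


def build_create_collection_request(
    api_endpoint: str, keyspace: str, token: str,
    name: str, dimension: int, metric: str,
) -> urllib.request.Request:
    """Assemble (but do not send) a createCollection request."""
    payload = {
        "createCollection": {
            "name": name,
            "options": {
                "defaultId": {"type": "objectId"},
                "vector": {"dimension": dimension, "metric": metric},
                "indexing": {"allow": ["*"]},
            },
        }
    }
    return urllib.request.Request(
        url=f"{api_endpoint}/api/json/v1/{keyspace}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Token": token,
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute your own endpoint, keyspace, and token.
request = build_create_collection_request(
    "https://example.invalid", "default_keyspace", "TOKEN",
    name="vector_collection", dimension=5, metric="cosine",
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen` and check the parsed body for the `{"status": {"ok": 1}}` response shown above.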
The defaultId option
The Data API defaultId option controls how the Data API allocates a new _id for each document that does not specify a value in the request.
For backwards compatibility with Data API releases before version 1.0.3, if you omit the defaultId option on createCollection, a document’s _id value is a plain String version of a random-based UUID (version 4).
Once the collection has been created, you cannot change the |
If you include a defaultId option with createCollection, you must set the type. The value is case-sensitive. Specify one of the following:
Type | Meaning |
---|---|
|
Each document’s generated |
|
Each document’s generated |
|
Each document’s |
|
Each document’s generated |
Example:
{
"createCollection": {
"name": "vector_collection2",
"options": {
"defaultId": {
"type": "objectId"
},
"vector": {
"dimension": 1024,
"metric": "cosine"
}
}
}
}
When you add documents to your collection, using Data API commands such as insertOne and insertMany, you would not specify an explicitly numbered _id value (such as "_id": "12") in the request. The server allocates a unique value per document based on the type you indicated in the createCollection command’s defaultId option.
Client apps can detect the use of $objectId or $uuid in the response document and return to the caller the objects that represent the types natively. In this way, client apps can use generated IDs in the methods that are based on Data API operations such as findOneAndUpdate, updateOne, and updateMany.
For example, in Python, the client can specify the detected value for a document’s $objectId or $uuid:
# API Response with $objectId
{
"_id": {"$objectId": "57f00cf47958af95dca29c0c"},
"summary": "Retrieval-Augmented Generation is the process of optimizing the output of a large language model..."
}
# Client returns Dict from collection.find_one()
my_doc = {
"_id": astrapy.ObjectId("57f00cf47958af95dca29c0c"),
"summary": "Retrieval-Augmented Generation is the process of optimizing the output of a large language model..."
}
# API Response with $uuid
{
"_id": {"$uuid": "ffd1196e-d770-11ee-bc0e-4ec105f276b8"},
"summary": "Retrieval-Augmented Generation is the process of optimizing the output of a large language model..."
}
# Client returns Dict from collection.find_one()
my_doc = {
"_id": UUID("ffd1196e-d770-11ee-bc0e-4ec105f276b8"),
"summary": "Retrieval-Augmented Generation is the process of optimizing the output of a large language model..."
}
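The conversion described above can be sketched in plain Python. This is an illustration only, not how any particular client implements it: `ObjectIdStandIn`, `to_native_id`, and `to_native_document` are hypothetical names, and the stand-in class replaces a client's real ObjectId type so that the example runs with the standard library alone.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class ObjectIdStandIn:
    """Hypothetical placeholder for a client's native ObjectId type."""
    hex_value: str


def to_native_id(raw_id: Any) -> Any:
    """Convert a {"$uuid": ...} or {"$objectId": ...} wrapper to an object.

    Any other value (for example, a string or number set by the caller)
    is returned unchanged.
    """
    if isinstance(raw_id, dict) and len(raw_id) == 1:
        if "$uuid" in raw_id:
            return uuid.UUID(raw_id["$uuid"])
        if "$objectId" in raw_id:
            return ObjectIdStandIn(raw_id["$objectId"])
    return raw_id


def to_native_document(document: Dict[str, Any]) -> Dict[str, Any]:
    """Return a shallow copy of `document` with its `_id` converted."""
    converted = dict(document)
    if "_id" in converted:
        converted["_id"] = to_native_id(converted["_id"])
    return converted


response_doc = {
    "_id": {"$uuid": "ffd1196e-d770-11ee-bc0e-4ec105f276b8"},
    "summary": "...",
}
print(repr(to_native_document(response_doc)["_id"]))
```

Because the converted `_id` is a regular Python object, it can be passed back as a filter value in later calls without the caller handling the wrapper format.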
There are many advantages to using generated _id values with documents, versus relying on manually numbered _id values. For example, with generated _id values of type uuidv7:
-
Uniqueness across the database: A generated _id value is designed to be globally unique across the entire database. This uniqueness is achieved through a combination of timestamp, machine identifier, process identifier, and a sequence number. Explicitly numbering documents might lead to clashes unless carefully managed, especially in distributed systems.
-
Automatic generation: The _id values are automatically generated by Astra DB Serverless. This means you won’t have to worry about creating and maintaining a unique ID system, reducing the complexity of the code and the risk of errors.
-
Timestamp information: A generated _id value includes a timestamp as its first component, representing the document’s creation time. This can be useful for tracking when a document was created without needing an additional field. In particular, type uuidv7 values provide a high degree of granularity (milliseconds) in timestamps.
-
Avoids manual sequence management: Managing sequential numeric IDs manually can be challenging, especially in environments with high concurrency or distributed systems. There’s a risk of ID collision or the need to lock tables or sequences to generate a new ID, which can affect performance. Generated _id values are designed to handle these issues automatically.
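To illustrate the timestamp point, the following sketch reads the creation time from a version 7 UUID using only the standard library. It relies on the standard UUIDv7 layout, in which the 48 most significant bits hold a Unix timestamp in milliseconds. `uuid7_created_at` and `make_example_uuid7` are hypothetical helpers, and the example value is built by hand rather than generated by Astra DB.

```python
import uuid
from datetime import datetime, timezone


def uuid7_created_at(value: uuid.UUID) -> datetime:
    """Return the creation time embedded in a version 7 UUID."""
    if value.version != 7:
        raise ValueError("expected a version 7 UUID")
    millis = value.int >> 80  # keep the 48 most significant of 128 bits
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


def make_example_uuid7(millis: int) -> uuid.UUID:
    """Build a UUIDv7-shaped value for the demo (hypothetical helper).

    Only the timestamp, version, and variant bits are set. A real
    generator fills the remaining bits with random data.
    """
    return uuid.UUID(int=(millis << 80) | (0x7 << 76) | (0b10 << 62))


example = make_example_uuid7(1_700_000_000_000)
print(example, uuid7_created_at(example).isoformat())
```

The same shift works on any `_id` of type uuidv7 once the client has returned it as a `uuid.UUID` object.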
While numeric _id values might be simpler and more human-readable, the benefits of generated _id values make them a better choice for most applications, especially those that have many documents.
The indexing option
The Data API createCollection command includes an optional indexing clause.
If you omit the indexing option, by default all properties in the document are indexed when it is added or modified in the database.
The index is implemented as a Storage-Attached Index (SAI), which enables Data API queries that filter and/or sort data based on the indexed property.
If you specify the indexing option when you create a collection, you must include one (but not both) of the following: an allow or a deny array.
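The one-but-not-both rule is easy to check on the client before sending a request. The following is a minimal sketch in plain Python. `validate_indexing` is a hypothetical helper written for illustration, not part of the Data API or its clients, and the server still performs its own validation.

```python
from typing import Any, Dict, List


def validate_indexing(indexing: Dict[str, Any]) -> List[str]:
    """Return the property list from an `indexing` option.

    Mirrors the rule that exactly one of `allow` or `deny` must be present.
    """
    keys = set(indexing) & {"allow", "deny"}
    if len(keys) != 1:
        raise ValueError("indexing must contain exactly one of 'allow' or 'deny'")
    (key,) = keys
    paths = indexing[key]
    if not isinstance(paths, list) or not all(isinstance(p, str) for p in paths):
        raise ValueError(f"indexing.{key} must be a list of property names")
    return paths


print(validate_indexing({"allow": ["city", "country"]}))
try:
    validate_indexing({"allow": ["city"], "deny": ["blob"]})
except ValueError as err:
    print(f"rejected: {err}")
```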
Pros and cons of selective indexing
Selective indexing involves a trade-off. Skipping the indexing of certain properties can increase write-time performance, but you need to decide, when you create the collection, which properties subsequent queries will filter or sort on. You can only filter or sort on properties that have been indexed. The Data API returns an error if you attempt to filter or sort on a non-indexed property.
The error would have one of these formats:
UNINDEXED_FILTER_PATH("Unindexed filter path"), ...
UNINDEXED_SORT_PATH("Unindexed sort path"), ...
ID_NOT_INDEXED("_id is not indexed"), ...
Example:
UNINDEXED_FILTER_PATH("Unindexed filter path: The filter path ('address.city') is not indexed)"
While weighing the pros and cons of indexed or non-indexed properties in a document, consider the maximum size limits for those properties. Non-indexed properties allow for a much larger quantity of data, to accommodate data such as a blog post’s String content. In comparison, indexed properties are appropriately bound by lower maximum size limits to ensure efficient and performant read operations via the SAI index. You’ll want to evaluate the pros and cons for each property in a document, and make decisions with the Of course, test your app’s performance with the database including average and peak loads. If you need to adjust |
Indexing allow example
cURL example:
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
"createCollection": {
"name": "vector_collection",
"options": {
"vector": {
"dimension": 5,
"metric": "cosine"
},
"indexing": {
"allow": [
"property1",
"property2"
]
}
}
}
}' | jq
# '| jq' is optional.
In the preceding allow example, only the values of property1 and property2 are included in the SAI index. No other properties are indexed.
The net result for subsequent update operations:
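Property name | Indexed? |
---|---|
property1 | Yes |
property2 | Yes |
property3 | No |
property3.prop3a | No |
property3.prop3b | No |
property4 | No |
property5 | No |
property5.prop5a | No |
property5.prop5b | No |
property5.prop5c | No |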
As a result, subsequent Data API queries may perform filtering and/or sort operations based only on property1, property2, or both.
Indexing deny example
Now let’s take the inverse approach with an indexing deny array example in cURL:
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
"createCollection": {
"name": "vector_collection",
"options": {
"vector": {
"dimension": 5,
"metric": "cosine"
},
"indexing": {
"deny": [
"property1",
"property3",
"property5.prop5b"
]
}
}
}
}' | jq
# '| jq' is optional.
In the preceding example, all the properties in the document are indexed except the ones listed in the deny clause.
Notice that the parent property3 was specified, which means its sub-properties property3.prop3a and property3.prop3b are also not indexed.
However, the specific sub-property property5.prop5b was listed in the deny clause, which means property5.prop5b is not indexed, but the parent property5 and the sub-properties property5.prop5a and property5.prop5c are included in the SAI index.
The net result for subsequent update operations:
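Property name | Indexed? |
---|---|
property1 | No |
property2 | Yes |
property3 | No |
property3.prop3a | No |
property3.prop3b | No |
property4 | Yes |
property5 | Yes |
property5.prop5a | Yes |
property5.prop5b | No |
property5.prop5c | Yes |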
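The allow and deny rules described above can be modeled in a few lines of Python. This is a sketch for reasoning about the rules, not server code. `is_indexed` is a hypothetical helper, and it assumes that a listed property also covers its sub-properties (as the deny example describes) and that `"*"` matches every property.

```python
from typing import Dict, List


def _covers(entry: str, path: str) -> bool:
    """True if `entry` is the wildcard, the path itself, or one of its parents."""
    return entry == "*" or path == entry or path.startswith(entry + ".")


def is_indexed(path: str, indexing: Dict[str, List[str]]) -> bool:
    """Model whether `path` is indexed under an `allow` or `deny` list."""
    if "allow" in indexing:
        return any(_covers(entry, path) for entry in indexing["allow"])
    if "deny" in indexing:
        return not any(_covers(entry, path) for entry in indexing["deny"])
    return True  # no indexing clause: every property is indexed


deny = {"deny": ["property1", "property3", "property5.prop5b"]}
for name in ["property1", "property2", "property3.prop3a",
             "property5", "property5.prop5a", "property5.prop5b"]:
    print(f"{name}: {'Yes' if is_indexed(name, deny) else 'No'}")
```

A check like this can help decide, before creating a collection, whether the properties you plan to filter or sort on would end up in the index.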
Indexing wildcard examples
The createCollection command’s optional indexing clause provides a convenience wildcard ["*"] in its syntax. For example, in cURL, the following clause means that all properties will be indexed:
{
"indexing": {
"allow": ["*"]
}
}
The preceding example is equivalent to omitting the indexing clause: all properties in the document are indexed during update operations.
You can use the wildcard character with the deny clause:
{
"indexing": {
"deny": ["*"]
}
}
In the preceding example, no properties are indexed, not even $vector.
Find a collection
Get a reference to an existing collection.
-
Python
-
TypeScript
-
Java
For more information, see the API reference.
collection = database.get_collection("vector_collection")
The example above is equivalent to these two alternate notations:
collection1 = database["vector_collection"]
collection2 = database.vector_collection
The |
Most See the AsyncCollection API reference for details about the async API. |
Parameters:
Name | Type | Summary |
---|---|---|
name |
|
The name of the collection. |
namespace |
|
The namespace containing the collection. If no namespace is specified, the general setting for this database is used. |
embedding_api_key |
|
An optional API key that is passed to the Data API with each request in the form of an If you instantiated the collection with |
collection_max_time_ms |
|
A default timeout, in milliseconds, for the duration of each operation on the collection.
Individual timeouts can be provided to each collection method call and will take precedence,
with this value being an overall default. Note that for some methods involving multiple API calls
(such as |
Returns:
Collection
- An instance of the Collection class corresponding to the specified collection name.
Example response
Collection(name="vector_collection", namespace="default_keyspace", database=Database(api_endpoint="https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra.datastax.com", token="AstraCS:aAbB...", namespace="default_keyspace"))
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.get_collection("my_collection")
collection.count_documents({}, upper_bound=100) # returns, for example: 41
For more information, see the API reference.
const collection = db.collection('COLLECTION');
The |
A Collection is typed as Collection<Schema>, where Schema is the type of the documents in the collection.
Operations on the collection are strongly typed if a specific schema is provided. If no type is provided, they remain largely weakly typed, which may be preferred for dynamic data access and operations.
It’s up to the user to ensure that the provided type truly represents the documents in the collection.
Parameters:
Name | Type | Summary |
---|---|---|
collectionName |
|
The name of the collection. |
|
An alternative to |
|
options? |
Allows you to override which namespace to use for the collection. |
Returns:
Collection<Schema>
- An unverified reference to the collection.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get a new Db instance
const db = new DataAPIClient('TOKEN').db('API_ENDPOINT');
// Define the schema for the collection
interface User {
name: string,
age?: number,
}
(async function () {
// Basic untyped collection
const users1 = db.collection('users');
await users1.insertOne({ name: 'John' });
// Typed collection from different namespace with a specific embedding API key
const users2 = db.collection<User>('users', {
namespace: 'NAMESPACE',
embeddingApiKey: 'EMBEDDINGS_API_KEY',
});
await users2.insertOne({ name: 'John' });
})();
For more information, see the API reference.
// Given db Database object, get a reference to a collection
Collection<Document> collection = db.getCollection("collection_name");
// Gather collection information
CollectionOptions options = collection.getOptions();
Returns:
CollectionOptions
- The options object with all metadata (defaultId, vector, indexing) for the collection.
Example:
package com.datastax.astra.client.database;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.Database;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.CollectionOptions;
public class FindCollection {
public static void main(String[] args) {
Database db = new Database("TOKEN", "API_ENDPOINT");
// Find a collection
Collection<Document> collection = db.getCollection("collection_vector1");
// Gather collection information
CollectionOptions options = collection.getOptions();
// Check if a collection exists
boolean collectionExists = db.getCollection("collection_vector2").exists();
}
}
Find all collections
Retrieve an iterable object over collections. Unless otherwise specified, this implementation refers to the collections in the working namespace of the database.
-
Python
-
TypeScript
-
Java
-
cURL
-
CLI
For more information, see the API reference.
collection_iterable = database.list_collections()
Parameters:
Name | Type | Summary |
---|---|---|
namespace | | The namespace to be inspected. If not specified, the database’s working namespace is used. |
max_time_ms | | A timeout, in milliseconds, for the underlying HTTP request. |
Returns:
CommandCursor[CollectionDescriptor]
- An iterable over CollectionDescriptor objects.
Example response
# (output below reformatted with indentation for clarity)
# (a single example collection descriptor from the cursor is shown)
[
    ...,
    CollectionDescriptor(
        name='my_collection',
        options=CollectionOptions(
            vector=CollectionVectorOptions(
                dimension=3,
                metric='dot_product'
            ),
            indexing={'allow': ['field']}
        )
    ),
    ...
]
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
coll_cursor = database.list_collections()
coll_cursor # this looks like: CommandCursor("https://....astra.datastax.com", alive)
list(coll_cursor) # [CollectionDescriptor(name='my_v_col', ...), ...]
for coll_desc in database.list_collections():
print(coll_desc)
# will print:
# CollectionDescriptor(name='my_v_col', options=CollectionOptions(vector=CollectionVectorOptions(dimension=3, metric='dot_product', service=None)))
# ...
For more information, see the API reference.
const collections = await db.listCollections({ nameOnly: false });
Parameters:
Name | Type | Summary |
---|---|---|
options |
Options regarding listing collections. |
Options (ListCollectionsOptions
):
Name | Type | Summary |
---|---|---|
|
If true, only the name of the collection is returned. Else, the full information for each collection is returned. Defaults to true. |
|
|
The namespace to be inspected. If not specified, the database’s working namespace is used. |
|
|
Maximum time in milliseconds the client should wait for the operation to complete. |
Returns:
Promise<FullCollectionInfo[]>
- A promise that resolves to an array of full collection information objects.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get a new Db instance
const db = new DataAPIClient('TOKEN').db('API_ENDPOINT');
(async function () {
// Gets full info about all collections in db
const collections = await db.listCollections({ nameOnly: false });
for (const collection of collections) {
console.log(`Collection '${collection.name}' has default ID type '${collection.options.defaultId?.type}'`);
}
})();
For more information, see the API reference.
// Given db Database object, list all collections
Stream<CollectionInfo> collection = db.listCollections();
Returns:
Stream<CollectionInfo>
- The definition elements of collections.
Example:
package com.datastax.astra.client.database;
import com.datastax.astra.client.Database;
import com.datastax.astra.client.model.CollectionInfo;
import java.util.stream.Stream;
public class ListCollections {
public static void main(String[] args) {
Database db = new Database("TOKEN", "API_ENDPOINT");
// Get collection Names
Stream<String> collectionNames = db.listCollectionNames();
// Get Collection information (with options)
Stream<CollectionInfo> collections = db.listCollections();
collections.map(CollectionInfo::getOptions).forEach(System.out::println);
}
}
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
"findCollections": {
"options": {
"explain": true
}
}
}' | jq
# '| jq' is optional.
Parameters:
Name | Type | Summary |
---|---|---|
findCollections |
command |
The Data API command to find all collections in the database. It acts as a container for all the attributes and settings required to find collections. |
options |
string |
Under this key, an additional setting for |
explain |
boolean |
When set to |
Response
{
"status": {
"collections": [
{
"name": "vector_collection",
"options": {
"defaultId": {
"type": "objectId"
},
"vector": {
"dimension": 5,
"metric": "cosine"
},
"indexing": {
"allow": [
"*"
]
}
}
}
]
}
}
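A response like the one above can be reduced to a short summary on the client. The following sketch uses only the standard library. `summarize_collections` is a hypothetical helper, and it assumes each entry under `status.collections` is an object with `name` and `options` keys, as in this example response.

```python
from typing import Any, Dict, List, Optional, Tuple

CollectionSummary = Tuple[str, Optional[int], Optional[str]]


def summarize_collections(response: Dict[str, Any]) -> List[CollectionSummary]:
    """Return (name, dimension, metric) for each collection in the response.

    Collections without `vector` options report None for dimension and metric.
    """
    summary = []
    for entry in response.get("status", {}).get("collections", []):
        vector = entry.get("options", {}).get("vector") or {}
        summary.append((entry["name"], vector.get("dimension"), vector.get("metric")))
    return summary


response = {
    "status": {
        "collections": [
            {
                "name": "vector_collection",
                "options": {
                    "defaultId": {"type": "objectId"},
                    "vector": {"dimension": 5, "metric": "cosine"},
                    "indexing": {"allow": ["*"]},
                },
            }
        ]
    }
}
print(summarize_collections(response))
```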
To list all collections in a database, use the following command:
astra db list-collections DATABASE_NAME
Parameters:
Name | Type | Summary |
---|---|---|
db_name | | The name of the database. |
Example output:
+---------------------+-----------+-------------+
| Name                | Dimension | Metric      |
+---------------------+-----------+-------------+
| collection_simple   |           |             |
| collection_vector   | 14        | cosine      |
| msp                 | 1536      | dot_product |
+---------------------+-----------+-------------+
List collection names
Get the names of the collections as a list of strings. Unless otherwise specified, this refers to the collections in the namespace the database is set to use.
-
Python
-
TypeScript
-
Java
-
cURL
-
CLI
For more information, see the API reference.
database.list_collection_names()
Get the names of the collections in a specified namespace of the database.
database.list_collection_names(namespace="that_other_namespace")
Parameters:
Name | Type | Summary |
---|---|---|
namespace | | The namespace to be inspected. If not specified, the database’s working namespace is used. |
max_time_ms | | A timeout, in milliseconds, for the underlying HTTP request. |
Returns:
List[str]
- A list of the collection names, in no particular order.
Example response
['a_collection', 'another_col']
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
database.list_collection_names()
# ['a_collection', 'another_col']
For more information, see the API reference.
const collectionNames = await db.listCollections({ nameOnly: true });
Get the names of the collections in a specified namespace of the database.
const collectionNames = await db.listCollections({ nameOnly: true, namespace: 'NAMESPACE' });
Parameters:
Name | Type | Summary |
---|---|---|
options |
Options regarding listing collections. |
Options (ListCollectionsOptions):
Name | Type | Summary |
---|---|---|
|
If true, only the name of the collection is returned. Else, the full information for each collection is returned. Defaults to true. |
|
|
The namespace to be inspected. If not specified, the database’s working namespace is used. |
|
|
Maximum time in milliseconds the client should wait for the operation to complete. |
Returns:
Promise<string[]>
- A promise that resolves to an array of the collection names.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get a new Db instance
const db = new DataAPIClient('TOKEN').db('API_ENDPOINT');
(async function () {
// Gets just names of all collections in db
const collections = await db.listCollections({ nameOnly: true });
for (const collectionName of collections) {
console.log(`Collection '${collectionName}' exists`);
}
})();
For more information, see the API reference.
// Given db Database object, list all collection names
Stream<String> collectionNames = db.listCollectionNames();
Returns:
Stream<String>
- The names of the collections.
Example:
package com.datastax.astra.client.database;
import com.datastax.astra.client.Database;
import com.datastax.astra.client.model.CollectionInfo;
import java.util.stream.Stream;
public class ListCollections {
public static void main(String[] args) {
Database db = new Database("API_ENDPOINT", "TOKEN");
// Get collection Names
Stream<String> collectionNames = db.listCollectionNames();
// Get Collection information (with options)
Stream<CollectionInfo> collections = db.listCollections();
collections.map(CollectionInfo::getOptions).forEach(System.out::println);
}
}
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
"findCollections": {
"options": {
"explain": true
}
}
}' | jq
# '| jq' is optional.
Parameters:
Name | Type | Summary |
---|---|---|
findCollections |
command |
The Data API command to find all collections in the database. It acts as a container for all the attributes and settings required to find collections. |
options |
object |
Under this key, an additional setting for |
explain |
boolean |
When set to |
Response
{
"status": {
"collections": [
{
"name": "vector_collection",
"options": {
"defaultId": {
"type": "objectId"
},
"vector": {
"dimension": 5,
"metric": "cosine"
},
"indexing": {
"allow": [
"*"
]
}
}
}
]
}
}
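To use this response in a script, you usually only need the names. The following is a minimal sketch of extracting them from the JSON body. `collection_names_from_response` is a hypothetical helper. Handling a plain list of name strings is an assumption about what the API returns when full details are not requested, not something shown in the response above.

```python
import json
from typing import Any, Dict, List


def collection_names_from_response(response: Dict[str, Any]) -> List[str]:
    """Pull collection names out of a findCollections response body."""
    collections = response.get("status", {}).get("collections", [])
    names = []
    for entry in collections:
        if isinstance(entry, dict):
            # Detailed shape, as in the response above: {"name": ..., "options": ...}
            names.append(entry["name"])
        else:
            # Assumed shape when only names are returned: a bare string
            names.append(str(entry))
    return names


# Abbreviated copy of the response shown above.
sample = json.loads("""
{
  "status": {
    "collections": [
      {
        "name": "vector_collection",
        "options": {"vector": {"dimension": 5, "metric": "cosine"}}
      }
    ]
  }
}
""")

print(collection_names_from_response(sample))  # ['vector_collection']
```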
To list all collections in a database, use the following command:
astra db list-collections DATABASE_NAME | cut -b 1-23
Parameters:
Name | Type | Summary |
---|---|---|
db_name |
|
The name of the database |
Example output:
+---------------------+
| Name                |
+---------------------+
| collection_simple   |
| collection_vector   |
| msp                 |
+---------------------+
Drop a collection
Drop (delete) a collection from a database. This also erases all data stored in the collection.
-
Python
-
TypeScript
-
Java
-
cURL
For more information, see the API reference.
result = database.drop_collection(name_or_collection="vector_collection")
Calling this method is equivalent to invoking the collection’s own method |
Parameters:
Name | Type | Summary |
---|---|---|
name_or_collection |
|
either the name of a collection or a |
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. |
Returns:
Dict
- A dictionary in the form {"ok": 1} if the method succeeds.
Example response
{'ok': 1}
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
database.list_collection_names()
# prints: ['a_collection', 'my_v_col', 'another_col']
database.drop_collection("my_v_col") # {'ok': 1}
database.list_collection_names()
# prints: ['a_collection', 'another_col']
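Because dropping a collection erases its data, scripts often guard the call with an existence check. The sketch below combines the two methods used in the example above. `drop_if_exists` and `_StubDatabase` are hypothetical names; the stub only mimics `list_collection_names()` and `drop_collection()` so the snippet runs without a live database.

```python
from typing import Any, Dict, List, Optional


def drop_if_exists(database, name: str) -> Optional[Dict[str, Any]]:
    """Drop `name` only if the database currently lists it.

    Returns the drop result, or None if the collection was not found.
    """
    if name not in database.list_collection_names():
        return None
    return database.drop_collection(name)


class _StubDatabase:
    """Hypothetical in-memory stand-in for a Database object."""

    def __init__(self, names: List[str]) -> None:
        self._names = list(names)

    def list_collection_names(self) -> List[str]:
        return list(self._names)

    def drop_collection(self, name_or_collection: str) -> Dict[str, Any]:
        self._names.remove(name_or_collection)
        return {"ok": 1}


stub = _StubDatabase(["a_collection", "my_v_col", "another_col"])
first = drop_if_exists(stub, "my_v_col")   # collection exists, so it is dropped
second = drop_if_exists(stub, "my_v_col")  # already gone, nothing to do
print(first, second)  # {'ok': 1} None
```

The check and the drop are two separate requests, so this guard is not atomic: another client could create or drop the collection between them.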
For more information, see the API reference.
const ok = await db.dropCollection('COLLECTION');
Calling this method is equivalent to invoking the collection’s own method |
Parameters:
Name | Type | Summary |
---|---|---|
name |
|
The name of the collection to delete. |
options? |
Allows you to override the namespace & set a |
Options (DropCollectionOptions):
Name | Type | Summary |
---|---|---|
|
The namespace containing the collection. If not specified, the database’s working namespace is used. |
|
|
Maximum time in milliseconds the client should wait for the operation to complete. |
Returns:
Promise<boolean>
- A promise that resolves to true if the collection was dropped successfully.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get a new Db instance
const db = new DataAPIClient('TOKEN').db('API_ENDPOINT');
(async function () {
// Uses db's default namespace
const success1 = await db.dropCollection('users');
console.log(success1); // true
// Overrides db's default namespace
const success2 = await db.dropCollection('users', {
namespace: 'NAMESPACE'
});
console.log(success2); // true
})();
For more information, see the API reference.
// Given db Database object, drop a collection by name
db.dropCollection("collectionName");
Parameters:
Name | Type | Summary |
---|---|---|
|
|
The name of the collection to delete. |
Example:
package com.datastax.astra.client.database;
import com.datastax.astra.client.Database;
public class DropCollection {
public static void main(String[] args) {
Database db = new Database("API_ENDPOINT", "TOKEN");
// Delete an existing collection
db.dropCollection("collection_vector2");
}
}
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
"deleteCollection": {
"name": "vector_collection"
}
}' | jq
# '| jq' is optional.
Response
{
"status": {
"ok": 1
}
}
Parameters:
Name | Type | Summary |
---|---|---|
|
|
The name of the collection to delete. |
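When you send this command from a script, it helps to build the request body and check the result programmatically. The following is a minimal sketch that mirrors the request and response shown above. Both helper names are hypothetical, and treating any body without `"ok": 1` under `status` as a failure is an assumption.

```python
import json
from typing import Any, Dict


def build_delete_collection_payload(name: str) -> str:
    """Serialize the deleteCollection command for the given collection name."""
    return json.dumps({"deleteCollection": {"name": name}})


def drop_succeeded(response: Dict[str, Any]) -> bool:
    """Return True if the response reports status.ok == 1."""
    return response.get("status", {}).get("ok") == 1


payload = build_delete_collection_payload("vector_collection")
print(payload)  # {"deleteCollection": {"name": "vector_collection"}}

# Response body copied from the example above.
print(drop_succeeded({"status": {"ok": 1}}))  # True
```

Using `json.dumps` rather than string formatting keeps the body valid JSON even if the collection name contains characters that need escaping.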