Document IDs

Documents in a collection are always identified by an ID that is unique within the collection. This identifier is stored in the reserved field _id. There are multiple types of document identifiers, such as string, integer, or datetime; however, the uuid and ObjectId types are recommended. The Data API supports uuid identifiers up to version 8 and ObjectId identifiers as provided by the bson library.
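As a small illustration of how UUID versions can be told apart, the following sketch reads the version digit from a canonical UUID string. The `uuidVersion` helper is hypothetical and is not part of the Data API or any client library. It only assumes the standard 8-4-4-4-12 text layout, where the first hex digit of the third group encodes the version.

```typescript
// Hypothetical helper (not part of any client library): reads the
// version digit from a canonical 8-4-4-4-12 UUID string.
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function uuidVersion(id: string): number {
  if (!UUID_PATTERN.test(id)) {
    throw new Error(`Not a canonical UUID string: ${id}`);
  }
  // Index 14 is the first hex digit of the third group,
  // which encodes the UUID version.
  return parseInt(id.charAt(14), 16);
}

console.log(uuidVersion('016b1cac-14ce-660e-8974-026c927b9b91'));
```

A check like this can help when you accept identifiers from outside your application and want to confirm which UUID version they use before storing them.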

Default document IDs

When you insert a document into a collection, you can either pass an explicit _id or use an automatically generated ID.

The collection’s defaultId option controls how the Data API allocates an _id for any document that doesn’t otherwise specify an _id when added to a collection.

After you create a collection, you can’t change the defaultId option.

If you omit the defaultId option on createCollection, the default type is uuid. This means that the server generates a random stringified UUIDv4 as the _id for any document without an explicit _id field. This enables backwards compatibility with Data API versions 1.0.2 and earlier.

If you include the defaultId option with createCollection, you must specify one of the following case-sensitive ID types:

  • objectId: Each document’s generated _id is an ObjectId.

  • uuidv6: Each document’s generated _id is a version 6 UUID. This is field-compatible with version 1 time UUIDs, and it supports lexicographical sorting.

  • uuidv7: Each document’s generated _id is a version 7 UUID. This is designed as a replacement for version 1 time UUIDs, and it is recommended for use in new systems.

  • uuid: Each document’s generated _id is a version 4 random UUID. This type is analogous to the uuid type and functions in Apache Cassandra®.

For examples of setting the default ID when creating a collection, see Create a collection.
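The following sketch shows one way to validate a defaultId type before sending a request. The `buildCreateCollection` helper and the `people` collection name are hypothetical, and the payload shape (`createCollection` with `options.defaultId.type`) is an assumption that you should confirm against Create a collection. The point is the case-sensitive check against the four types listed above.

```typescript
// The four case-sensitive defaultId types listed above.
const DEFAULT_ID_TYPES = ['objectId', 'uuidv6', 'uuidv7', 'uuid'] as const;
type DefaultIdType = (typeof DEFAULT_ID_TYPES)[number];

function isDefaultIdType(value: string): value is DefaultIdType {
  return (DEFAULT_ID_TYPES as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical helper: builds a createCollection request body.
// The payload shape is an assumption, not confirmed by this page.
function buildCreateCollection(name: string, defaultIdType?: string): object {
  if (defaultIdType === undefined) {
    // No defaultId option: the collection uses the uuid default.
    return { createCollection: { name } };
  }
  if (!isDefaultIdType(defaultIdType)) {
    // For example, 'UUIDv7' is rejected because the types are case-sensitive.
    throw new Error(`Unsupported defaultId type: ${defaultIdType}`);
  }
  return {
    createCollection: { name, options: { defaultId: { type: defaultIdType } } },
  };
}

console.log(JSON.stringify(buildCreateCollection('people', 'uuidv7'), null, 2));
```

Because the defaultId option can't be changed after creation, validating the value on the client side before the request is sent can catch a typo early.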

Set document IDs when inserting documents

When you use the Data API to add documents to a collection, the _id field is optional.

If you omit the _id field, then the server generates a unique identifier for each document based on the collection’s default ID type.

If you provide an explicit _id value, then the server uses this value instead of generating an ID. If explicitly defined, the _id field must be a top-level document property. _id cannot be nested within another property.
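To make the top-level rule concrete, here is a minimal sketch that reports whether a document carries an explicit _id or relies on a generated one. The `idSource` helper is hypothetical and runs entirely on the client. It only inspects top-level properties, and it makes no claim about how the server handles a nested `_id` property.

```typescript
type JsonDocument = Record<string, unknown>;

// Hypothetical helper: reports where the document's ID would come from.
// Only a top-level _id counts as an explicit document ID.
function idSource(doc: JsonDocument): 'explicit' | 'generated' {
  return Object.prototype.hasOwnProperty.call(doc, '_id')
    ? 'explicit'
    : 'generated';
}

console.log(idSource({ _id: 'abc', name: 'John' }));
console.log(idSource({ name: 'Jane' }));
console.log(idSource({ name: 'Dan', meta: { _id: 'abc' } }));
```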

Benefits of automatically generated document IDs

There are advantages to using generated document IDs instead of manual document IDs. For example, the advantages of generated UUIDv7 document IDs include the following:

  • Uniqueness across the database: A generated _id value is designed to be globally unique across the entire database. Depending on the ID type, this uniqueness is achieved by combining a timestamp with random bits (UUIDv7) or with a machine identifier, process identifier, and sequence number (ObjectId). Explicitly numbering documents might lead to clashes unless carefully managed, especially in distributed systems.

  • Automatic generation: The _id values are automatically generated by Astra DB Serverless. This means you won’t have to worry about creating and maintaining a unique ID system, reducing the complexity of the code and the risk of errors.

  • Timestamp information: A generated _id value includes a timestamp as its first component, representing the document’s creation time. This can be useful for tracking when a document was created without needing an additional field. In particular, uuidv7 values record timestamps with millisecond granularity.

  • Avoids manual sequence management: Managing sequential numeric IDs manually can be challenging, especially in environments with high concurrency or distributed systems. There’s a risk of ID collision or the need to lock tables or sequences to generate a new ID, which can affect performance. Generated _id values are designed to handle these issues automatically.

While numeric _id values might be simpler and more human-readable, the benefits of generated _id values make them the better choice for most applications, especially those that have many documents.
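The timestamp benefit can be sketched with two small helpers that recover the creation time from an ID's string form. Both helpers are hypothetical and are not part of any client library. They rely only on the standard layouts: a version 7 UUID starts with 48 bits of Unix time in milliseconds, and an ObjectId starts with 4 bytes of Unix time in seconds. The UUID value below is made up for illustration.

```typescript
// Hypothetical helpers, not part of any client library.

// Reads the creation time from a version 7 UUID: the first 48 bits
// (12 hex digits) hold Unix time in milliseconds.
function uuidV7Timestamp(id: string): Date {
  const hex = id.replace(/-/g, '');
  if (!/^[0-9a-f]{32}$/i.test(hex) || hex.charAt(12) !== '7') {
    throw new Error(`Not a version 7 UUID: ${id}`);
  }
  return new Date(parseInt(hex.slice(0, 12), 16));
}

// Reads the creation time from an ObjectId: the first 4 bytes
// (8 hex digits) hold Unix time in seconds.
function objectIdTimestamp(id: string): Date {
  if (!/^[0-9a-f]{24}$/i.test(id)) {
    throw new Error(`Not an ObjectId string: ${id}`);
  }
  return new Date(parseInt(id.slice(0, 8), 16) * 1000);
}

// Made-up version 7 UUID, used only for illustration.
console.log(uuidV7Timestamp('018e77bc-648d-7795-a0e2-1cad0fdd53f5').toISOString());
console.log(objectIdTimestamp('65fd9b52d7fabba03349d013').toISOString());
```

Helpers like these are only a convenience for inspection. If your application depends on creation time, consider whether an explicit date field is clearer than decoding the identifier.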

Other document identifiers

Regardless of the defaultId setting, the Data API honors document identifiers of any type, anywhere in a document, that you explicitly provide at any time:

  • You can include identifiers anywhere in a document, not only in the _id field.

  • You can include different types of identifiers in different parts of the same document.

  • You can define identifiers at any time, such as when inserting or updating a document.

  • You can use any of a document’s identifiers for filter clauses and update/replace operations, just like any other data type.

TypeScript

To use and generate identifiers, astra-db-ts provides the UUID and ObjectId classes. These are not the same as those exported from the bson or uuid libraries. Instead, these are custom classes that you must import from the astra-db-ts package:

import { UUID, ObjectId } from '@datastax/astra-db-ts';

To generate new identifiers, you can use UUID.v4(), UUID.v7(), or new ObjectId():

import { DataAPIClient, UUID, ObjectId } from '@datastax/astra-db-ts';

// Schema for the collection
interface Person {
  _id: UUID | ObjectId;
  name: string;
  friendId?: UUID;
}

// Reference the DB instance
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { keyspace: 'KEYSPACE' });

(async function () {
  // Create the collection
  const collection = await db.createCollection<Person>('people');

  // Insert documents w/ various IDs
  await collection.insertOne({ name: 'John', _id: UUID.v4() });
  await collection.insertOne({ name: 'Jane', _id: new UUID('016b1cac-14ce-660e-8974-026c927b9b91') });

  await collection.insertOne({ name: 'Dan', _id: new ObjectId()});
  await collection.insertOne({ name: 'Tim', _id: new ObjectId('65fd9b52d7fabba03349d013') });

  // Update a document with a UUID in a non-_id field
  await collection.updateOne(
    { name: 'John' },
    { $set: { friendId: new UUID('016b1cac-14ce-660e-8974-026c927b9b91') } },
  );

  // Find a document by a UUID in a non-_id field
  const john = await collection.findOne({ name: 'John' });
  const jane = await collection.findOne({ _id: john!.friendId });

  // Prints 'Jane 016b1cac-14ce-660e-8974-026c927b9b91 6'
  console.log(jane?.name, jane?._id.toString(), (<UUID>jane?._id).version);
})();

All UUID generation methods return an instance of the same UUID class, which exposes a version property. You can also construct a UUID from the string representation of an ID, which is useful if you generate IDs yourself.

Java

The Java client defines dedicated classes to support different implementations of UUID, particularly v6 and v7.

When a unique identifier is retrieved from the server, it is returned as a uuid, and then it is converted to the appropriate UUID class, based on the class definition in the defaultId option.

The ObjectId class is taken from the BSON package, and it represents the ObjectId type. UUIDs generated by the standard Java UUID class, for example with UUID.randomUUID(), follow the UUID v4 standard.

To generate new identifiers, you can use methods like new UUIDv6(), new UUIDv7(), or new ObjectId():

package com.datastax.astra.client.collection;

import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.ObjectId;
import com.datastax.astra.client.model.UUIDv6;
import com.datastax.astra.client.model.UUIDv7;

import java.time.Instant;
import java.util.UUID;

import static com.datastax.astra.client.model.Filters.eq;
import static com.datastax.astra.client.model.Updates.set;

public class WorkingWithDocumentIds {
    public static void main(String[] args) {
        // Given an existing collection
        Collection<Document> collection = new DataAPIClient("TOKEN")
                .getDatabase("API_ENDPOINT")
                .getCollection("COLLECTION_NAME");

        // IDs can be any JSON scalar type
        // ('defaultId' option NOT set for the collection)
        new Document().id("abc");
        new Document().id(123);
        new Document().id(Instant.now());

        // Working with UUIDv4
        new Document().id(UUID.randomUUID());

        // Working with UUIDv6
        collection.insertOne(new Document().id(new UUIDv6()).append("tag", "new_id_v_6"));
        // Wrap an existing UUID parsed from its string form
        // (the version digit of this sample value is 8, hence the tag)
        UUID parsedUuid = UUID.fromString("018e77bc-648d-8795-a0e2-1cad0fdd53f5");
        collection.insertOne(new Document().id(new UUIDv6(parsedUuid)).append("tag", "id_v_8"));

        // Working with UUIDv7
        collection.insertOne(new Document().id(new UUIDv7()).append("tag", "new_id_v_7"));

        // Working with ObjectIds
        collection.insertOne(new Document().id(new ObjectId()).append("tag", "obj_id"));
        collection.insertOne(new Document().id(new ObjectId("6601fb0f83ffc5f51ba22b88")).append("tag", "obj_id"));

        // Find a document by its ObjectId and store a UUID in a non-_id field
        collection.findOneAndUpdate(
                eq(new ObjectId("6601fb0f83ffc5f51ba22b88")),
                set("item_inventory_id", UUID.fromString("1eeeaf80-e333-6613-b42f-f739b95106e6")));
    }
}

curl

When you insert a document, you can omit _id to automatically generate an ID, or you can manually specify an _id, such as "_id": "12".

The following example inserts two documents with manually defined _id values. One document uses the objectId type, and the other uses the uuid type.

{
  "insertMany": {
    "documents": [
      {
        "_id": { "$objectId": "6672e1cbd7fabb4e5493916f" },
        "$vector": [0.1, 0.15, 0.3, 0.12, 0.05],
        "key": "value",
        "amount": 53990
      },
      {
        "_id": { "$uuid": "1ef2e42c-1fdb-6ad6-aae4-e84679831739" },
        "$vector": [0.15, 0.1, 0.1, 0.35, 0.55],
        "key": "value",
        "amount": 4600
      }
    ]
  }
}

When you add or update a document, you can include additional identifiers in any document property, other than _id, just as you would any other data type.
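As a rough sketch of that last point, the following builds an update request that stores a UUID in a property other than _id. The `asUuid` and `asObjectId` helpers and the `related_id` field name are hypothetical. The `$uuid` and `$objectId` wrappers mirror the insertMany example above, and the `updateOne` body shape (a `filter` plus an `update` with `$set`) is an assumption to verify against the Data API reference.

```typescript
// Hypothetical helpers that wrap identifier strings in the same
// $uuid / $objectId notation used by the insertMany example.
const asUuid = (value: string) => ({ $uuid: value });
const asObjectId = (value: string) => ({ $objectId: value });

// Assumed request shape: find a document by its ObjectId, then
// store a UUID in a non-_id field (related_id is a made-up name).
const updateRequest = {
  updateOne: {
    filter: { _id: asObjectId('6672e1cbd7fabb4e5493916f') },
    update: {
      $set: { related_id: asUuid('1ef2e42c-1fdb-6ad6-aae4-e84679831739') },
    },
  },
};

console.log(JSON.stringify(updateRequest, null, 2));
```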


© 2025 DataStax