$vector and $vectorize in collections

When you work with documents in the Astra Portal or the Data API, two fields are reserved for vector data: $vector and $vectorize.

Which fields you can use depends on the collection configuration.

Embedding generation methods

When you create a collection, you decide whether the collection can store structured vector data. A collection that can store vector data is known as a vector-enabled collection. For vector-enabled collections, you also decide how to provide embeddings. You must make these choices when you create the collection:

  • For all vector-enabled collections, you can provide embeddings when you load data (also known as bring your own embeddings).

  • You can configure the collection to automatically generate embeddings with vectorize (the $vectorize reserved field).

    You can use $vectorize only in a collection where you enabled vectorize when you created the collection.

  • If you enable vectorize, you can use both options interchangeably but not simultaneously. For example, you can use vectorize to generate embeddings for a batch of documents, and then insert a few documents with pre-generated embeddings.

    To bring your own embeddings to a collection that uses vectorize, when you insert a document, include the document’s embedding in the $vector field.

    It is critical that all embeddings in a collection are generated by the same model with the same dimensions, regardless of whether you use vectorize, bring your own embeddings, or both.

    Astra DB only checks that the dimensions are the same; it does not produce an error if the embeddings are from different models. You must ensure that the embeddings are compatible. Using mismatched embeddings produces unreliable and incorrect results in vector searches.

  • For all vector-enabled collections, you can insert non-vector data.
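The two collection configurations above can be sketched as request bodies. This is a minimal illustration, not a definitive reference: it assumes the Data API createCollection command accepts options.vector with dimension, metric, and an optional service block, and the collection names, the dimension of 1024, and the provider and model names are placeholders. Check the exact shape against the Data API reference before you use it.

```python
import json

def create_collection_payload(name, dimension, metric="cosine", service=None):
    """Build a createCollection command body for a vector-enabled collection.

    The payload shape is an assumption based on the Data API command format.
    """
    vector_options = {"dimension": dimension, "metric": metric}
    if service is not None:
        # Adding an embedding provider is what enables vectorize in this sketch.
        vector_options["service"] = service
    return {"createCollection": {"name": name, "options": {"vector": vector_options}}}

# Bring your own embeddings only: no "service" block.
byo = create_collection_payload("docs_byo", dimension=1024)

# Vectorize enabled: the provider and model names are placeholders.
auto = create_collection_payload(
    "docs_vectorize",
    dimension=1024,
    service={"provider": "example-provider", "modelName": "example-model"},
)

print(json.dumps(auto, indent=2))
```

Because the choice is fixed at creation time, building both payloads side by side makes the only difference visible: the presence of the service block.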

Reserved fields

$vector

The $vector parameter is a reserved field that stores vectors.

To bring your own embeddings when you insert documents, include $vector for each document that has an embedding.

If the collection uses vectorize, you have the option to omit $vector when you insert documents. You can use $vectorize to generate an embedding, and then Astra DB populates the document’s $vector field with the automatically generated embedding. Alternatively, if you want to bring your own embeddings to a collection that uses vectorize, you can include the $vector field when you insert documents.

Regardless of the embedding generation method, when you find, update, replace, or delete documents, you can use $vector to fetch documents by vector search. You can also use projections to include $vector in responses.
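As a rough sketch of the two uses of $vector described above, the following builds an insert that carries a pre-generated embedding and a find that sorts by vector similarity. The command shapes (insertOne with a document, find with sort and options.limit) are assumed from the Data API command format, and the helper names, the title field, and the three-dimensional vectors are illustrative only.

```python
def insert_with_vector(fields, embedding):
    """Build an insertOne body that brings its own embedding in $vector."""
    document = dict(fields)
    document["$vector"] = list(embedding)
    return {"insertOne": {"document": document}}

def find_by_vector(query_embedding, limit=5):
    """Build a find body that runs a vector search by sorting on $vector."""
    return {
        "find": {
            "sort": {"$vector": list(query_embedding)},
            "options": {"limit": limit},
        }
    }

# Toy three-dimensional vectors; real embeddings must match the
# dimension configured for the collection.
insert_cmd = insert_with_vector({"title": "Intro"}, [0.12, 0.52, 0.32])
find_cmd = find_by_vector([0.10, 0.50, 0.30], limit=3)
```

The query vector must come from the same model as the stored embeddings, for the same reason that all stored embeddings must.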

$vectorize

The $vectorize parameter is a reserved field that generates embeddings automatically based on a given text string.

You can use $vectorize only in a collection where you enabled vectorize when you created the collection.

If the collection uses vectorize, you have the option to include this parameter when you insert documents. The value of $vectorize is the text string from which you want to generate a document’s embedding. Make sure the vectorize text string complies with the embedding provider’s requirements, such as token size. Astra DB stores the resulting vector array in $vector.

When you find, update, replace, or delete documents in a collection that uses vectorize, you can use $vectorize to fetch documents by vector search with vectorize. You can also use projections to include $vectorize in responses.
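The insert and search behavior for $vectorize might look like the sketch below, which only applies to a collection created with vectorize enabled. The command shapes are assumed from the Data API command format. The max_chars guard is a hypothetical client-side stand-in for the provider's real token limit, which depends on the provider and is not something this code can know.

```python
def insert_with_vectorize(fields, text, max_chars=None):
    """Build an insertOne body that asks the server to embed `text`.

    `max_chars` is a hypothetical, crude proxy for a provider token limit.
    """
    if max_chars is not None and len(text) > max_chars:
        raise ValueError("text exceeds the configured length guard")
    document = dict(fields)
    document["$vectorize"] = text
    return {"insertOne": {"document": document}}

def find_by_vectorize(query_text, limit=5):
    """Build a find body that embeds `query_text` server side and searches."""
    return {
        "find": {
            "sort": {"$vectorize": query_text},
            "options": {"limit": limit},
        }
    }

insert_cmd = insert_with_vectorize(
    {"title": "Intro"}, "Vector search finds similar items.", max_chars=2000
)
find_cmd = find_by_vectorize("How does similarity search work?", limit=3)
```

Note that the insert carries no $vector field; per the text above, Astra DB fills it in from the generated embedding.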

For information about vectorize integrations and troubleshooting vectorize, see Auto-generate embeddings with vectorize.

$vector and $vectorize are excluded by default from Data API responses. You can use projections to include these properties in responses.
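One way to request the reserved fields is to add them to a projection, sketched below. The projection syntax ({"$vector": 1}) is assumed from the Data API command format, and the helper name and the category filter are hypothetical. How a projection on the reserved fields affects the other returned fields is not covered here, so confirm that against the projection reference.

```python
def add_projection(find_command, include_vector=False, include_vectorize=False):
    """Return a copy of a find command that requests the reserved fields."""
    body = dict(find_command["find"])
    projection = dict(body.get("projection", {}))
    if include_vector:
        projection["$vector"] = 1
    if include_vectorize:
        projection["$vectorize"] = 1
    if projection:
        body["projection"] = projection
    return {"find": body}

# The filter field "category" is illustrative only.
base = {"find": {"filter": {"category": "manual"}, "options": {"limit": 3}}}
with_fields = add_projection(base, include_vector=True, include_vectorize=True)
```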

Insert non-vector data in a vector-enabled collection

To insert a document that doesn’t need an embedding, omit $vector and $vectorize. When using the Astra Portal to load JSON or CSV data into a collection that uses vectorize, make sure the Vector Field is set to None (no embeddings).
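A small client-side check can make the three kinds of document explicit before you send them. This sketch is not part of any Astra DB client: the function name is hypothetical, and rejecting a document that carries both fields is this sketch's reading of "not simultaneously", applied per document as an assumption.

```python
def embedding_mode(document):
    """Classify a document by how its embedding, if any, is provided."""
    has_vector = "$vector" in document
    has_vectorize = "$vectorize" in document
    if has_vector and has_vectorize:
        # Assumption: both fields on one document counts as simultaneous use.
        raise ValueError("use $vector or $vectorize on a document, not both")
    if has_vector:
        return "vector"
    if has_vectorize:
        return "vectorize"
    return "none"

documents = [
    {"title": "a", "$vector": [0.1, 0.2, 0.3]},
    {"title": "b", "$vectorize": "text to embed"},
    {"title": "c"},  # non-vector data: neither reserved field is present
]

modes = [embedding_mode(d) for d in documents]
print(modes)  # ['vector', 'vectorize', 'none']
```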
