Ways to insert data in Astra DB Serverless
You can insert data into Astra DB Serverless databases programmatically and in the Astra Portal. Astra DB also supports options for bulk uploads and migrations.
Permissions required to insert data
To insert data into an Astra DB Serverless database, you must be able to read and write to the target database, keyspace, and collection or table.
To insert data in the Astra Portal, a valid role, such as the Database Administrator role, must be assigned directly to you.
To insert data with the Data API, you need an application token with a valid role.
To insert data with the standalone CQL shell or a driver, use your database’s Secure Connect Bundle (SCB) for authentication and authorization.
Content requirements for CSV and JSON files
If you insert data from a CSV or JSON file, the data must be compatible with Astra DB and, if applicable, the table schema.
For example, if you insert a CSV file into a table, the CSV file must contain the same column names and data types as the table.
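As a rough pre-flight check for the column-name requirement above, the following sketch compares a CSV header row against the column names you expect in the target table. The function name `check_csv_header`, the sample columns, and the sample CSV content are all hypothetical and only for illustration. The sketch does not verify data types, which still need to match the table schema.

```python
import csv
import io


def check_csv_header(csv_text, expected_columns):
    """Compare a CSV header row against the expected table column names.

    Returns (missing, unexpected): columns the table expects but the file
    lacks, and columns the file has that the table does not define.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    found = [name.strip() for name in header]
    missing = [c for c in expected_columns if c not in found]
    unexpected = [c for c in found if c not in expected_columns]
    return missing, unexpected


# Hypothetical table columns and CSV content, for illustration only.
table_columns = ["id", "title", "rating"]
sample_csv = "id,title,score\n1,Dune,4.5\n"

missing, unexpected = check_csv_header(sample_csv, table_columns)
print("missing:", missing)
print("unexpected:", unexpected)
```

If either list is non-empty, the file likely needs to be adjusted before you insert it into the table.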
If you insert a JSON file exported from a database that isn’t based on Apache Cassandra®, you might need to transform the data into a format that is compatible with Astra DB before you insert the data. For more information, see Migrate from non-Cassandra sources.
Additionally, if your CSV or JSON file is larger than 40 MB, see Migrate or insert large amounts of data.
Content requirements for collections
Collections contain pieces of data known as documents, which are similar to rows in a table. Each document consists of one or more properties or fields.
Regardless of how you choose to insert data, the following requirements apply to documents inserted into collections.
Rules for field names in documents
A document can contain user-defined and reserved fields.
User-defined field names must follow these rules:
- Must start and end with a letter or an underscore
- Can contain letters, numbers, underscores, and hyphens
- Must have a length of 1 to 100 characters
- Cannot match the name of a reserved field
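The rules above can be sketched as a small client-side check. This is a minimal illustration, not the validation that Astra DB itself performs. It assumes that "letters" and "numbers" mean ASCII letters and digits, and the helper name `is_valid_field_name` is hypothetical. The reserved names are the ones listed in this section.

```python
import re

# Reserved field names listed in this section.
RESERVED_FIELDS = {"_id", "$vector", "$vectorize", "$lexical", "$hybrid"}

# Assumption: "letters" and "numbers" mean ASCII letters and digits.
# The first and last characters must be a letter or an underscore;
# characters in between can also be digits or hyphens.
_FIELD_NAME = re.compile(r"[A-Za-z_](?:[A-Za-z0-9_-]*[A-Za-z_])?")


def is_valid_field_name(name):
    """Return True if name satisfies the user-defined field name rules."""
    if not 1 <= len(name) <= 100:
        return False
    if name in RESERVED_FIELDS:
        return False
    return _FIELD_NAME.fullmatch(name) is not None


# Hypothetical field names, for illustration only.
for candidate in ["first_name", "my-field", "user-1", "1st_place", "_id"]:
    print(candidate, is_valid_field_name(candidate))
```

Here, `user-1` is rejected because it ends with a number, `1st_place` because it starts with one, and `_id` because it is a reserved field.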
Reserved fields are tied to specific functionality. Include the following reserved fields in your documents, if applicable:
- _id: An optional unique identifier for the document. If _id is omitted, it is created automatically based on the collection’s ID type. For more information, see Document IDs.
- $vector: An optional array of numbers representing a vector embedding for vector search. The $vector field is only supported for vector-enabled collections. A document cannot contain both a $vector and a $vectorize field. For more information, see $vector in collections.
- $vectorize: An optional string from which to generate vector embeddings for vector search. The $vectorize field is only supported for collections that have an embedding provider integration. A document cannot contain both a $vector and a $vectorize field. For more information, see $vectorize in collections.
- $lexical: An optional string of space-separated keywords or terms to make the document searchable for the lexical search component of hybrid search. The $lexical field is only supported for collections that have lexical search enabled. For more information, see Create a collection.
- $hybrid: An optional string that populates both $vectorize and $lexical. The $hybrid shorthand is only supported for collections that have vectorize and lexical search enabled. If a document uses $hybrid, it cannot contain a root-level $vectorize or $lexical field.
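To illustrate how the reserved fields interact, the following sketch checks a document, represented as a Python dictionary, for the two conflicts described above: $vector together with $vectorize, and $hybrid together with a root-level $vectorize or $lexical. The helper name `reserved_field_conflicts` and the sample documents are hypothetical. The sketch does not check whether the collection itself is vector-enabled or has the required integrations.

```python
def reserved_field_conflicts(document):
    """Return a list of reserved-field conflicts found at the document root."""
    problems = []
    if "$vector" in document and "$vectorize" in document:
        problems.append("$vector and $vectorize cannot both be present")
    if "$hybrid" in document:
        for field in ("$vectorize", "$lexical"):
            if field in document:
                problems.append(
                    f"$hybrid cannot be combined with root-level {field}"
                )
    return problems


# Hypothetical documents, for illustration only.
ok_document = {
    "_id": "book-001",
    "title": "Dune",
    "$vectorize": "A desert planet and the struggle to control it",
}
bad_document = {
    "title": "Dune",
    "$vector": [0.12, 0.45, 0.31],
    "$vectorize": "A desert planet and the struggle to control it",
}

print(reserved_field_conflicts(ok_document))
print(reserved_field_conflicts(bad_document))
```

An empty list means that no conflict between reserved fields was detected at the root of the document.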
If you insert data in the Astra Portal, reserved fields can be automatically detected or applied to your documents.
For example, if your collection has an embedding provider integration, you can select a field to designate as the |
Maximum limits for fields and documents
Documents and fields are subject to the limits described in Data API limits.