BLOB type

Use the blob type for binary data.

Although you can store binary-encoded vector data in a blob column, you cannot create a vector index or perform a vector search on that column with the Data API. Instead, use the vector type to store vector data.

Due to a known issue with filtering on blob, DataStax does not recommend using blob in primary keys.

  • Python

  • TypeScript

  • Java

  • curl

In Python, you can insert binary data into a blob column as a Base64-encoded string with $binary or as a bytes value.

The Python client returns binary data received from the Data API as a bytes value, even if it was inserted as a Base64-encoded string.

from astrapy import DataAPIClient

# Get an existing table
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")
table = database.get_table("TABLE_NAME")

# Insert binary values
result = table.insert_one(
    {
        "id": "1234",
        "example_blob": b"=\xfb\xe7m>\xe9x\xd5?I\xfb\xe7",
        "another_example_blob": {"$binary": "PfvnbT7peNU/Sfvn"},
    }
)
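The two values in the example above carry the same twelve bytes in two different forms. As a minimal sketch using only the Python standard library (no Data API call is made, and the variable names are illustrative), the following shows how a bytes value maps to the Base64 string used with $binary, and back:

```python
import base64

# The raw bytes used for "example_blob" in the example above
raw = b"=\xfb\xe7m>\xe9x\xd5?I\xfb\xe7"

# Encode the raw bytes into the Base64 text that goes inside {"$binary": ...}
encoded = base64.b64encode(raw).decode("ascii")
payload = {"$binary": encoded}

# Decode the Base64 text back into the original bytes
decoded = base64.b64decode(payload["$binary"])

print(encoded)         # PfvnbT7peNU/Sfvn
print(decoded == raw)  # True
```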

In TypeScript, you can insert binary data as a DataAPIBlob object. The DataAPIBlob class and its blob shorthand accept DataAPIBlobLike input, such as ArrayBuffer and { $binary: string }.

The TypeScript client returns binary data received from the Data API as a DataAPIBlob object. The DataAPIBlob class provides methods like asArrayBuffer() for accessing the binary data.

import { DataAPIClient, DataAPIBlob } from "@datastax/astra-db-ts";

// Get an existing table
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});
const table = database.table("TABLE_NAME");

// Insert binary values
(async function () {
  const result = await table.insertOne({
    id: "1234",
    example_blob: new DataAPIBlob({ $binary: "PfvnbT7peNU/Sfvn" }),
    another_example_blob: new DataAPIBlob(Buffer.from([0x0, 0x1, 0x2])),
  });
})();

In Java, you can insert binary data as a byte array.

The Java client automatically converts byte[] values into a Base64-encoded string.

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.commands.results.TableInsertOneResult;
import com.datastax.astra.client.tables.definition.rows.Row;

public class Example {

  public static void main(String[] args) {
    // Get an existing table
    Table<Row> table =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getTable("TABLE_NAME");

    // Insert binary values
    byte[] exampleBytes = {
      (byte) 0x3D, (byte) 0xFB, (byte) 0xE7, (byte) 0x6D,
      (byte) 0x3E, (byte) 0xE9, (byte) 0x78, (byte) 0xD5,
      (byte) 0x3F, (byte) 0x49, (byte) 0xFB, (byte) 0xE7
    };

    Row row = new Row().addText("id", "1234").addBlob("example_blob", exampleBytes);

    TableInsertOneResult result = table.insertOne(row);
  }
}

With curl, you can insert binary data as a Base64-encoded string with $binary.

Vector binary encodings specification

A d-dimensional vector is a list of d floating-point numbers that can be binary encoded.

To prepare for encoding, the list must be transformed into a sequence of bytes where each float is represented as four bytes (a 32-bit float) in big-endian format. Then, the byte sequence is Base64-encoded, with = padding if needed. For example, here are some vectors and their resulting Base64-encoded strings:

[0.1, -0.2, 0.3] = "PczMzb5MzM0+mZma"
[0.1, 0.2] = "PczMzT5MzM0="
[10, 10.5, 100, -91.19] = "QSAAAEEoAABCyAAAwrZhSA=="
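To check examples like these, it can help to run the encoding in reverse. The following is a minimal sketch, assuming the format described above (big-endian, four bytes per float, Base64 with padding). The decode_vector helper is a hypothetical name for illustration and is not part of any client library.

```python
import base64
import struct


def decode_vector(b64_string):
    """Decode a Base64 string into a list of floats (hypothetical helper)."""
    raw = base64.b64decode(b64_string)
    if len(raw) % 4 != 0:
        raise ValueError("byte length must be a multiple of 4")
    d = len(raw) // 4  # number of dimensions
    # ">" selects big-endian; each "f" is one four-byte float
    return list(struct.unpack(">" + "f" * d, raw))


decoded = decode_vector("PczMzT5MzM0=")
print(decoded)  # two values close to 0.1 and 0.2
```

The decoded values are 32-bit approximations, so a value such as 0.1 comes back with a small rounding error rather than as exactly 0.1. Values that a 32-bit float can represent exactly, such as 10, 10.5, and 100, come back unchanged.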

Once encoded, you use $binary to pass the Base64 string to the Data API:

{ "$binary": "BASE64_STRING" }

You can use a script to encode your vectors. For example, in Python:

import base64
import struct

input_vector = [0.1, -0.2, 0.3]
d = len(input_vector)
pack_format = ">" + "f" * d
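binary_encode = base64.b64encode(struct.pack(pack_format, *input_vector)).decode()
print(binary_encode)  # PczMzb5MzM0+mZma

The following request inserts a row with a Base64-encoded blob value:

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/TABLE_NAME" \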
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "insertOne": {
    "document": {
      "id": "1234",
      "example_blob" : {"$binary": "PfvnbT7peNU/Sfvn"}
    }
  }
}'
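If the binary data starts out as raw bytes, the request body for a command like the one above can be generated rather than written by hand. The following is a minimal sketch using only the Python standard library. The build_insert_one_body helper is a hypothetical name for illustration, and the column names are the placeholders from the example above.

```python
import base64
import json


def build_insert_one_body(row_id, blob):
    """Build an insertOne JSON body with a $binary blob (hypothetical helper)."""
    document = {
        "id": row_id,
        # Wrap the Base64 text in a $binary object
        "example_blob": {"$binary": base64.b64encode(blob).decode("ascii")},
    }
    return json.dumps({"insertOne": {"document": document}})


body = build_insert_one_body("1234", b"=\xfb\xe7m>\xe9x\xd5?I\xfb\xe7")
print(body)
```

The resulting string can be passed to curl with --data in place of the inline JSON.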


© Copyright IBM Corporation 2025
