BLOB type
Use the blob type for binary data.
Although you can store binary-encoded vector data in a blob column, you won't be able to create a vector index or perform a vector search on that column through the Data API.
Instead, use the vector type to store vector data.
Due to a known issue with filtering on
- Python
- TypeScript
- Java
- curl
You can insert binary data into a blob column as a Base64-encoded string with $binary or as a bytes value.
The Python client returns binary data received from the Data API as a bytes value, even if it was inserted as a Base64-encoded string.
from astrapy import DataAPIClient

# Get an existing table
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")
table = database.get_table("TABLE_NAME")

# Insert binary values
result = table.insert_one(
    {
        "id": "1234",
        "example_blob": b"=\xfb\xe7m>\xe9x\xd5?I\xfb\xe7",
        "another_example_blob": {"$binary": "PfvnbT7peNU/Sfvn"},
    }
)
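The two values in the example above are the same twelve bytes written in two notations. As a minimal sketch using only the Python standard library (no Data API call is made, and the variable names are illustrative), you can check the correspondence like this:

```python
import base64

# Raw bytes, as passed directly to the client in the example above
raw = b"=\xfb\xe7m>\xe9x\xd5?I\xfb\xe7"

# The same bytes as a Base64 string wrapped in a $binary object
encoded = base64.b64encode(raw).decode("ascii")
wrapped = {"$binary": encoded}

# Decoding the Base64 string recovers the original bytes
decoded = base64.b64decode(wrapped["$binary"])

print(encoded)
print(decoded == raw)
```

Either form should therefore be usable when you already have the data in one representation and don't want to convert it by hand.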
You can insert binary data as a DataAPIBlob object.
The DataAPIBlob class and its blob shorthand accept DataAPIBlobLike input, such as ArrayBuffer and { $binary: string }.
The TypeScript client returns binary data received from the Data API as a DataAPIBlob object.
The DataAPIBlob class provides methods like asArrayBuffer() for accessing the binary data.
import { DataAPIClient, DataAPIBlob } from "@datastax/astra-db-ts";

// Get an existing table
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});
const table = database.table("TABLE_NAME");

// Insert binary values
(async function () {
  const result = await table.insertOne({
    id: "1234",
    example_blob: new DataAPIBlob({ $binary: "PfvnbT7peNU/Sfvn" }),
    another_example_blob: new DataAPIBlob(Buffer.from([0x0, 0x1, 0x2])),
  });
})();
You can insert binary data as a byte array.
The Java client automatically converts byte[] values into a Base64-encoded string.
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.commands.results.TableInsertOneResult;
import com.datastax.astra.client.tables.definition.rows.Row;

public class Example {
  public static void main(String[] args) {
    // Get an existing table
    Table<Row> table =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getTable("TABLE_NAME");

    // Insert binary values
    byte[] exampleBytes = {
      (byte) 0x3D, (byte) 0xFB, (byte) 0xE7, (byte) 0x6D,
      (byte) 0x3E, (byte) 0xE9, (byte) 0x78, (byte) 0xD5,
      (byte) 0x3F, (byte) 0x49, (byte) 0xFB, (byte) 0xE7
    };
    Row row = new Row().addText("id", "1234").addBlob("example_blob", exampleBytes);
    TableInsertOneResult result = table.insertOne(row);
  }
}
With curl, you can insert binary data as a Base64-encoded string with $binary.
Vector binary encodings specification
A d-dimensional vector is a list of d floating-point numbers that can be binary encoded.
To prepare for encoding, the list must be transformed into a sequence of bytes where each float is represented as four bytes in big-endian format.
Then, the byte sequence is Base64-encoded, with = padding, if needed.
For example, here are some vectors and their resulting Base64-encoded strings:
[0.1, -0.2, 0.3] = "PczMzb5MzM0+mZma"
[0.1, 0.2] = "PczMzT5MzM0="
[10, 10.5, 100, -91.19] = "QSAAAEEoAABCyAAAwrZhSA=="
Once the vector is encoded, use $binary to pass the Base64 string to the Data API:
{ "$binary": "BASE64_STRING" }
You can use a script to encode your vectors, for example:
python
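import base64
import struct

input_vector = [0.1, -0.2, 0.3]
d = len(input_vector)
# ">" selects big-endian byte order; each "f" packs one 4-byte float
pack_format = ">" + "f" * d
binary_encode = base64.b64encode(struct.pack(pack_format, *input_vector)).decode()
print(binary_encode)  # PczMzb5MzM0+mZma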
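To check an encoded string, you can reverse the steps in the specification. The following is a sketch, not part of any client library: encode_vector and decode_vector are hypothetical helper names, and the decoder assumes the input holds a whole number of 4-byte big-endian floats.

```python
import base64
import struct


def encode_vector(vector):
    """Pack each number as a big-endian 4-byte float, then Base64-encode."""
    packed = struct.pack(">" + "f" * len(vector), *vector)
    return base64.b64encode(packed).decode("ascii")


def decode_vector(encoded):
    """Reverse encode_vector: Base64-decode, then unpack big-endian floats."""
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("expected a multiple of 4 bytes")
    return list(struct.unpack(">" + "f" * (len(raw) // 4), raw))


print(encode_vector([0.1, 0.2]))
print(decode_vector("QSAAAEEoAABCyAAAwrZhSA=="))
```

Because each value is stored as a 4-byte float, decoded numbers can differ slightly from the originals. For example, 0.1 does not have an exact 4-byte representation, so it comes back as a nearby value rather than exactly 0.1.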
curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/TABLE_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "insertOne": {
      "document": {
        "id": "1234",
        "example_blob": {"$binary": "PfvnbT7peNU/Sfvn"}
      }
    }
  }'
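If the binary data comes from a program rather than a hand-written string, you can generate the request body instead of typing the Base64 value. This sketch assumes the insertOne body shape shown in the curl command above. The function name build_insert_one_body is hypothetical, and the code only builds the JSON text without sending a request.

```python
import base64
import json


def build_insert_one_body(row_id, blob_bytes):
    """Build an insertOne JSON body with the blob wrapped in $binary."""
    document = {
        "id": row_id,
        "example_blob": {
            "$binary": base64.b64encode(blob_bytes).decode("ascii")
        },
    }
    return json.dumps({"insertOne": {"document": document}})


body = build_insert_one_body("1234", b"=\xfb\xe7m>\xe9x\xd5?I\xfb\xe7")
print(body)
```

The resulting string can be passed to curl with --data, or saved to a file and referenced with --data @FILE_NAME.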