Cassandra Query Language (CQL) for Astra DB
Because Astra DB Serverless is powered by Apache Cassandra®, you can send Cassandra Query Language (CQL) queries to your Astra DB Serverless databases with the CQL shell (cqlsh) or DataStax drivers.
To use CQL for Astra DB, you need an active Astra account and an active Astra DB Serverless database.
Familiarity with CQL is helpful but not required.
For more information about CQL for Astra DB, including command examples and specifications, see the CQL for Astra DB reference.
CQL for Astra DB limitations
To ensure high availability and optimum performance, Astra DB Serverless databases have guardrails on underlying Apache Cassandra® functionality, including commands and functionality available through CQL for Astra DB.
User-defined functions aren’t supported
CQL for Astra DB does not support user-defined functions (UDFs) or user-defined aggregates (UDAs).
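For example, a UDF definition like the following is not accepted; the keyspace, function name, and body here are hypothetical, shown only to illustrate the kind of statement this limitation covers:
CREATE OR REPLACE FUNCTION library.log_value (input double)
  CALLED ON NULL INPUT
  RETURNS double
  LANGUAGE java
  AS 'return Double.valueOf(Math.log(input.doubleValue()));';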
Unsupported values are ignored
If you try to run a CQL command with unsupported data definition language (DDL) properties, the command succeeds, but the unsupported values are ignored. For example, the following command creates a table with the defined attributes, except for the nodesync values:
CREATE TABLE IF NOT EXISTS library.books (
  id UUID PRIMARY KEY,
  lastname text,
  firstname text
) WITH nodesync = {'enabled': 'true'};
The response includes a warning that lists the unsupported values, which were ignored:
Warnings :
Ignoring provided values [nodesync] as they are not supported for Table
Properties(ignored values are: [additional_write_policy,
bloom_filter_fp_chance, caching, cdc, compaction, compression,
crc_check_chance, dse_edge_label_property, dse_vertex_label_property,
gc_grace_seconds, id, max_index_interval, memtable,
memtable_flush_period_in_ms, min_index_interval, nodesync, read_repair,
speculative_retry])
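To see which properties the database actually applied, you can inspect the resulting schema; for example, for the library.books table created above:
DESCRIBE TABLE library.books;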
Keyspace, table, and column names are case sensitive
If you create a table in the Astra Portal, the table name, keyspace name, and column names become case-sensitive. Make sure your CQL commands use the correct case for these values. For more information, see How do I fix a "table does not exist" error in CQL.
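Case-sensitive identifiers must be double-quoted in CQL; unquoted identifiers are folded to lowercase. For example, with a hypothetical keyspace and table created in the Astra Portal as MyKeyspace and BookCatalog:
SELECT * FROM "MyKeyspace"."BookCatalog" LIMIT 10;  -- matches the exact case
SELECT * FROM MyKeyspace.BookCatalog LIMIT 10;      -- treated as mykeyspace.bookcatalog, so the table is not found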
Unsupported CQL commands
Operations concerning keyspaces, materialized views, functions, aggregates, and search indexes are generally not supported. As a result, the following CQL commands aren’t supported for Astra DB:
- CREATE KEYSPACE
- ALTER KEYSPACE
- DROP KEYSPACE
- CREATE MATERIALIZED VIEW
- ALTER MATERIALIZED VIEW
- DROP MATERIALIZED VIEW
- CREATE AGGREGATE
- DESCRIBE AGGREGATE
- DROP AGGREGATE
- CREATE FUNCTION
- DESCRIBE FUNCTION
- DROP FUNCTION
- CREATE TRIGGER
- DROP TRIGGER
- CREATE ROLE
- ALTER ROLE
- DROP ROLE
- LIST ROLES
- LIST PERMISSIONS
- RESTRICT
- RESTRICT ROWS
- UNRESTRICT
- UNRESTRICT ROWS
- CREATE SEARCH INDEX
- COMMIT SEARCH INDEX
- REBUILD SEARCH INDEX
- RELOAD SEARCH INDEX
- ALTER SEARCH INDEX SCHEMA
- ALTER SEARCH INDEX CONFIG
- DROP SEARCH INDEX CONFIG
For a list of supported CQL commands, see the CQL for Astra DB quick reference.
Use CQL for Astra DB
To send CQL commands to your databases, you can use the CQL shell, a DataStax driver, or the Data API.
Data API and clients
You can use the Data API and clients to manage tables in Serverless (Vector) databases, including tables you created through the CQL shell or a driver. For more information, see Work with tables.
Drivers
DataStax supports the following drivers for Astra DB Serverless:
- C++
- C#
- Go
- Java
- Node.js
- Python
For supported driver versions and connection details, see the documentation for each driver.
CQL shell
The Cassandra Query Language Shell (cqlsh) is a utility that you can use to issue CQL commands to your Astra DB Serverless database.
You can use the embedded CQL shell in the Astra Portal or the standalone CQL shell.
Embedded CQL shell
The Astra Portal provides an embedded CQL shell instance, known as the CQL Console, for each Astra DB Serverless database. You can use the CQL Console to run CQL commands on your databases directly from your browser.
- In the Astra Portal navigation menu, select your database.
- Click CQL Console to open the CQL Console in a new tab.
- Wait for the token@cqlsh> prompt to appear. This prompt indicates that the CQL Console is connected to your database, and you can begin issuing CQL commands to your database, as shown in the example after these steps.
- (Optional) For multi-region databases, you can use the region menu to access data from a secondary region. However, due to Astra DB’s eventual consistency model, changes to data in any region are eventually replicated to the database’s other regions.
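Once the token@cqlsh> prompt appears, you can run any supported CQL statement. For example, the following commands list the keyspaces and tables your token can access without assuming anything about your schema:
token@cqlsh> DESCRIBE KEYSPACES;
token@cqlsh> DESCRIBE TABLES;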
Standalone CQL shell
The standalone CQL shell is a separate, lightweight utility that you can use to interact with your database.
You can use the Astra CLI to download and configure the standalone CQL shell, including the Secure Connect Bundle (SCB) and the token for your Astra account. For more information, see Use cqlsh in the Astra CLI documentation.
- Install Python version 2.7.12 or later with TLS support.
- Download the SCB for your database.
For multi-region databases, if you want to access data from a secondary region, you must download the SCB for that specific region. However, due to Astra DB’s eventual consistency model, changes to data in any region are eventually replicated to the database’s other regions.
- Download the CQL shell for your Astra DB Serverless database, using one of the following packages:
  - DataStax Astra
  - DataStax Astra with support for Vector Type
- Change to the directory where you downloaded the CQL shell package:
$ cd /DOWNLOAD_DIRECTORY
- Extract the files:
$ tar -xvf cqlsh-astra-DATE-bin.tar.gz
Replace DATE with the package release date, such as 20210304.
- Run the cqlsh script:
$ cd /cqlsh-astra
$ ./bin/cqlsh -u CLIENT_ID -p CLIENT_SECRET -b PATH_TO_SCB
Replace the following:
  - PATH_TO_SCB: The path to your database’s SCB zip file.
  - CLIENT_ID and CLIENT_SECRET: Either of the following sets of values from your application token:
    - The clientId value and the secret value
    - The literal string token and the token secret, prefixed by AstraCS:
- (Optional) Store credentials in the cqlshrc file.
If you don’t want to pass the credentials on the command line every time, you can configure them, along with the SCB location, in your cqlshrc file (a launch example follows this procedure):
  - In the ~/.cassandra directory, open the cqlshrc file.
  - In the [authentication] section, add the CLIENT_ID and CLIENT_SECRET values from your application token.
  - In the [connection] section, add the path to the SCB.
[authentication]
username = CLIENT_ID # or the literal string token
password = CLIENT_SECRET # or the token secret prefixed by 'AstraCS:'

[connection]
secure_connect_bundle = PATH_TO_SCB
- Make sure the command returns output indicating that you connected to your database. For example:
[cqlsh 6.8.0 | DSE 6.8.0.77 | CQL spec 3.4.5 | DSE protocol v2]
Use HELP for help.
username@cqlsh>
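Two optional variations on the launch command, shown as a sketch: if you stored the credentials and SCB path in your cqlshrc file as described above, you can start the shell with no flags, and the standard cqlsh --execute (-e) option runs a single statement non-interactively, which is useful for scripting. Both lines reuse the placeholders from the previous steps:
$ ./bin/cqlsh
$ ./bin/cqlsh -u CLIENT_ID -p CLIENT_SECRET -b PATH_TO_SCB -e "DESCRIBE KEYSPACES;"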
Use analyzers with CQL for Astra DB
Analyzers process the text in a column to enable term matching on strings. Combined with vector-based search algorithms, term matching makes it easier to find relevant information in a table. Analyzers semantically filter the results of a vector search by specific terms.
Analyzers are built on the Lucene Java Analyzer API. Storage Attached Indexes (SAI) use this API to transform text columns into tokens for indexing and querying, with either built-in or custom analyzers.
For example, if you generate an embedding from the phrase "tell me about available shoes", you can then use a vector search to get a list of rows with similar vectors. These rows will likely correlate with shoe-related strings.
SELECT * from products
ORDER BY vector ANN OF [6.0,2.0, ... 3.1,4.0]
LIMIT 10;
Alternatively, you can filter these search results by a specific keyword, such as hiking:
SELECT * from products
WHERE val : 'hiking'
ORDER BY vector ANN OF [6.0,2.0, … 3.1,4.0]
LIMIT 10;
To enable analyzer operations with CQL on a Serverless (Vector) database, you must create a SAI index with the index_analyzer option, and then use the : operator to search the analyzed column.
An analyzed index stores values derived from the raw column values; the stored values depend on the analyzer configuration. The analyzer determines how the column values are processed before indexing, and the same analyzer is applied to query terms.
Example: STANDARD tokenizer
- Create a table:
CREATE TABLE default_keyspace.products (
  id text PRIMARY KEY,
  val text
);
- Create an SAI index with the index_analyzer option and stemming enabled:
CREATE CUSTOM INDEX default_keyspace_products_val_idx
ON default_keyspace.products(val)
USING 'org.apache.cassandra.index.sai.StorageAttachedIndex'
WITH OPTIONS = {
  'index_analyzer': '{
    "tokenizer" : {"name" : "standard"},
    "filters" : [{"name" : "porterstem"}]
  }'
};
- Insert sample rows:
INSERT INTO default_keyspace.products (id, val) VALUES ('1', 'soccer cleats');
INSERT INTO default_keyspace.products (id, val) VALUES ('2', 'running shoes');
INSERT INTO default_keyspace.products (id, val) VALUES ('3', 'hiking shoes');
- Use keywords to query data from the inserted rows. The analyzer splits the text into case-independent terms; a stemming example follows these steps.
Query on "running":
SELECT * FROM default_keyspace.products WHERE val : 'running';
Query on "hiking" and "shoes":
SELECT * FROM default_keyspace.products WHERE val : 'hiking' AND val : 'shoes';
Example: N-GRAM tokenizer
An ngram tokenizer splits text into contiguous sequences of n characters, which helps capture linguistic patterns and context in natural language processing (NLP) tasks.
To configure an ngram tokenizer that also lowercases all tokens, the "tokenizer" key specifies the Lucene tokenizer, and the remaining key-value pairs in the JSON object configure the tokenizer:
WITH OPTIONS = {
'index_analyzer': '{
"tokenizer" : {"name" : "ngram", "args" : {"minGramSize":"2", "maxGramSize":"3"}},
"filters" : [{"name" : "lowercase"}]
}'
}
For more information, see N-GRAM with lowercase.
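As an end-to-end sketch of this configuration, the following statements create a hypothetical reviews table and index it with the ngram options shown above; the table, column, and index names are assumptions for illustration only:
CREATE TABLE default_keyspace.reviews (
  id text PRIMARY KEY,
  body text
);

CREATE CUSTOM INDEX reviews_body_ngram_idx
ON default_keyspace.reviews(body)
USING 'org.apache.cassandra.index.sai.StorageAttachedIndex'
WITH OPTIONS = {
  'index_analyzer': '{
    "tokenizer" : {"name" : "ngram", "args" : {"minGramSize":"2", "maxGramSize":"3"}},
    "filters" : [{"name" : "lowercase"}]
  }'
};

-- With 2- and 3-character grams and lowercasing, a short fragment such as
-- 'sho' is expected to match rows whose body contains words like 'shoes'.
SELECT * FROM default_keyspace.reviews WHERE body : 'sho';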
For more information about analyzers, the : operator, and analyzer examples, see CQL vector search with text analyzer and the Simple STANDARD analyzer example.
Supported built-in analyzers
CQL offers several built-in analyzers from the Lucene project.
The following built-in analyzers are available for Astra DB:
Category | Analyzers |
---|---|
Analyzers | standard, simple, whitespace, stop, lowercase |
Language analyzers | Arabic, Armenian, Basque, Bengali, Brazilian, Bulgarian, Catalan, CJK, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hindi, Hungarian, Indonesian, Irish, Italian, Latvian, Lithuanian, Norwegian, Persian, Portuguese, Romanian, Russian, Sorani, Spanish, Swedish |
Tokenizers | standard, korean, hmmChinese, openNlp, japanese, wikipedia, letter, keyword, whitespace, classic, pathHierarchy, edgeNGram, nGram, simplePatternSplit, simplePattern, pattern, thai, uax29UrlEmail, icu |
Character filters | htmlstrip, mapping, persian, patternreplace |
Token filters | apostrophe, arabicnormalization, arabicstem, bulgarianstem, bengalinormalization, bengalistem, brazilianstem, cjkbigram, cjkwidth, soraninormalization, soranistem, commongramsquery, dictionarycompoundword, hyphenationcompoundword, decimaldigit, lowercase, type, uppercase, czechstem, germanlightstem, germanminimalstem, germannormalization, germanstem, greeklowercase, greekstem, englishminimalstem, englishpossessive, kstem, porterstem, spanishlightstem, persiannormalization, finnishlightstem, frenchlightstem, frenchminimalstem, irishlowercase, galicianminimalstem, galicianstem, hindinormalization, hindistem, hungarianlightstem, hunspellstem, indonesianstem, indicnormalization, italianlightstem, latvianstem, minhash, asciifolding, capitalization, codepointcount, concatenategraph, daterecognizer, delimitedtermfrequency, fingerprint, fixbrokenoffsets, hyphenatedwords, keepword, keywordmarker, keywordrepeat, length, limittokencount, limittokenoffset, limittokenposition, removeduplicates, stemmeroverride, protectedterm, trim, truncate, typeassynonym, worddelimiter, worddelimitergraph, scandinavianfolding, scandinaviannormalization, edgengram, ngram, norwegianlightstem, norwegianminimalstem, patternreplace, patterncapturegroup, delimitedpayload, numericpayload, tokenoffsetpayload, typeaspayload, portugueselightstem, portugueseminimalstem, portuguesestem, reversestring, russianlightstem, shingle, fixedshingle, serbiannormalization, classic, standard, swedishlightstem, flattengraph, turkishlowercase, elision |
Astra DB doesn’t support the following built-in analyzers:
- synonymgraph
- synonym
- commongrams
- stop
- snowballporter
For more information and examples, see Built-in analyzers.