Integrate Presto with Astra DB Serverless
You can use Presto to explore and query data stored in your Astra DB Serverless databases with SQL.
This integration works by sending SQL queries from a Presto client to your Presto server. The Presto server retrieves data from Astra, computes the query results, and then returns them to the Presto client. To connect to Astra, this integration uses the following:
-
Presto Cassandra connector: A wrapper for a Apache Cassandra® Java driver.
-
cql-proxy
: A utility that can enable secure connections to Astra in Cassandra drivers and similar connectors that don’t include built-in support for Astra.
Prerequisites
-
Familiarity with SQL, Presto, and proxy server configuration.
-
An Astra DB Serverless database with table data.
This integration supports table data. It may not work as expected with vector-enabled collections.
-
An application token with a role that permits reading the database, at minimum.
Optional: Create demo tables
You can create sample tables to test this integration:
-
Create a keyspace in your database.
-
Use
cqlsh
to select your keyspace:USE KEYSPACE;
-
Create the demo tables:
CREATE TABLE customer ( id UUID, name TEXT, email TEXT, PRIMARY KEY (id) ); CREATE TABLE accounts_by_customer ( customer_id UUID, account_number TEXT, account_type TEXT, account_balance DECIMAL, customer_name TEXT STATIC, PRIMARY KEY ((customer_id), account_number) );
-
Insert sample data:
INSERT INTO customer (id,name,email) VALUES (8d6c1271-16b6-479d-8ea9-546c37381ab3,'Alice','alice@example.org'); INSERT INTO customer (id,name,email) VALUES (0e5d9e8c-2e3b-4576-8515-58b491cb859e,'Bob','bob@example.org'); INSERT INTO accounts_by_customer (customer_id,account_number,account_type,account_balance,customer_name) VALUES (8d6c1271-16b6-479d-8ea9-546c37381ab3,'A-101','Checking',100.01,'Alice'); INSERT INTO accounts_by_customer (customer_id,account_number,account_type,account_balance,customer_name) VALUES (8d6c1271-16b6-479d-8ea9-546c37381ab3,'A-102','Savings',200.02,'Alice'); INSERT INTO accounts_by_customer (customer_id,account_number,account_type,account_balance,customer_name) VALUES (0e5d9e8c-2e3b-4576-8515-58b491cb859e,'B-101','Checking',300.03,'Bob'); INSERT INTO accounts_by_customer (customer_id,account_number,account_type,account_balance,customer_name) VALUES (0e5d9e8c-2e3b-4576-8515-58b491cb859e,'B-102','Savings',400.04,'Bob');
Deploy cql-proxy
Use the instructions in the cql-proxy
README to deploy cql-proxy
as close to your Presto server as possible.
Preferably, deploy both components on the same server.
-
SCB authentication
-
Token-only authentication
DataStax recommends SCB authentication because the SCB helps set the contact points in the Presto Cassandra connector. |
You can authenticate cql-proxy
with a Secure Connect Bundle (SCB) and application token:
./cql-proxy \
--astra-bundle PATH/TO/SCB.ZIP \
--username token \
--password APPLICATION_TOKEN
When using an SCB, username
is the literal string token
.
You can authenticate cql-proxy
with your application token and database ID:
./cql-proxy \
--astra-token APPLICATION_TOKEN \
--astra-database-id DB_ID
Deploy your Presto server and Cassandra connector
-
Install and configure a Presto server or use an existing deployment.
-
Configure the Presto Cassandra connector in
etc/catalog/cassandra.properties
.The following configuration uses the Presto Cassandra connector to interact with
cql-proxy
and connect to Astra:connector.name=cassandra cassandra.contact-points=localhost cassandra.consistency-level=QUORUM
For information about supported consistency levels, see Database limits.
-
Start or restart your Presto server:
bin/launcher run
-
Wait for the
SERVER STARTED
message before continuing.
Run SQL queries with a Presto client
-
Install a Presto client. The following steps use the Presto CLI.
-
Start the CLI and connect to your Presto server with your
cassandra
catalog:./presto --server http://localhost:8080 --catalog cassandra
The
--server
option specifies the address and port of the Presto coordinator, and thecatalog
option sets the default catalog. -
Run an SQL query against your database.
The following examples use this integration’s optional demo tables:
Example: Get total number of customers
QuerySELECT COUNT(*) AS customer_count FROM banking_db.customer;
Resultcustomer_count ---------------- 2 (1 row)
Example: Get email addresses for customers with a balance of 300 or more
QuerySELECT DISTINCT email AS customer_email FROM banking_db.customer INNER JOIN banking_db.accounts_by_customer ON (id = customer_id) WHERE account_balance >= 300.00;
Resultcustomer_email ----------------- bob@example.org (1 row)
Example: Get customers and the total balance of their accounts
QuerySELECT id AS customer_id, name AS customer_name, email AS customer_email, SUM ( CAST ( COALESCE(account_balance,0) AS DECIMAL(12,2) ) ) AS customer_funds FROM banking_db.customer LEFT OUTER JOIN banking_db.accounts_by_customer ON (id = customer_id) GROUP BY id, name, email;
Resultcustomer_id | customer_name | customer_email | customer_funds --------------------------------------+---------------+-------------------+---------------- 8d6c1271-16b6-479d-8ea9-546c37381ab3 | Alice | alice@example.org | 300.03 0e5d9e8c-2e3b-4576-8515-58b491cb859e | Bob | bob@example.org | 700.07 (2 rows)