Integrate Presto with Astra DB Serverless
You can use Presto to explore and query data stored in your Astra DB Serverless databases with SQL.
This integration works by sending SQL queries from a Presto client to your Presto server. The Presto server retrieves data from Astra, computes the query results, and then returns them to the Presto client. To connect to Astra, this integration uses the following:
-
Presto Cassandra connector: A wrapper for a Apache Cassandra® Java driver.
-
cql-proxy: A utility that can enable secure connections to Astra in Cassandra drivers and similar connectors that don’t include built-in support for Astra.
Prerequisites
-
Familiarity with SQL, Presto, and proxy server configuration.
-
An Astra DB Serverless database with table data.
This integration supports table data. It may not work as expected with vector-enabled collections.
-
An application token with a role that permits reading the database, at minimum.
Optional: Create demo tables
You can create sample tables to test this integration:
-
Create a keyspace in your database.
-
Use
cqlshto select your keyspace:USE KEYSPACE; -
Create the demo tables:
CREATE TABLE customer ( id UUID, name TEXT, email TEXT, PRIMARY KEY (id) ); CREATE TABLE accounts_by_customer ( customer_id UUID, account_number TEXT, account_type TEXT, account_balance DECIMAL, customer_name TEXT STATIC, PRIMARY KEY ((customer_id), account_number) ); -
Insert sample data:
INSERT INTO customer (id,name,email) VALUES (8d6c1271-16b6-479d-8ea9-546c37381ab3,'Alice','alice@example.org'); INSERT INTO customer (id,name,email) VALUES (0e5d9e8c-2e3b-4576-8515-58b491cb859e,'Bob','bob@example.org'); INSERT INTO accounts_by_customer (customer_id,account_number,account_type,account_balance,customer_name) VALUES (8d6c1271-16b6-479d-8ea9-546c37381ab3,'A-101','Checking',100.01,'Alice'); INSERT INTO accounts_by_customer (customer_id,account_number,account_type,account_balance,customer_name) VALUES (8d6c1271-16b6-479d-8ea9-546c37381ab3,'A-102','Savings',200.02,'Alice'); INSERT INTO accounts_by_customer (customer_id,account_number,account_type,account_balance,customer_name) VALUES (0e5d9e8c-2e3b-4576-8515-58b491cb859e,'B-101','Checking',300.03,'Bob'); INSERT INTO accounts_by_customer (customer_id,account_number,account_type,account_balance,customer_name) VALUES (0e5d9e8c-2e3b-4576-8515-58b491cb859e,'B-102','Savings',400.04,'Bob');
Deploy cql-proxy
Use the instructions in the cql-proxy README to deploy cql-proxy as close to your Presto server as possible.
Preferably, deploy both components on the same server.
-
SCB authentication
-
Token-only authentication
|
DataStax recommends SCB authentication because the SCB helps set the contact points in the Presto Cassandra connector. |
You can authenticate cql-proxy with a Secure Connect Bundle (SCB) and application token:
./cql-proxy \
--astra-bundle PATH/TO/SCB.zip \
--username token \
--password APPLICATION_TOKEN
When using an SCB, username is the literal string token.
You can authenticate cql-proxy with your application token and database ID:
./cql-proxy \
--astra-token APPLICATION_TOKEN \
--astra-database-id DB_ID
Deploy your Presto server and Cassandra connector
-
Install and configure a Presto server or use an existing deployment.
-
Configure the Presto Cassandra connector in
etc/catalog/cassandra.properties.The following configuration uses the Presto Cassandra connector to interact with
cql-proxyand connect to Astra:connector.name=cassandra cassandra.contact-points=localhost cassandra.consistency-level=QUORUMFor information about supported consistency levels, see Database limits.
-
Start or restart your Presto server:
bin/launcher run -
Wait for the
SERVER STARTEDmessage before continuing.
Run SQL queries with a Presto client
-
Install a Presto client. The following steps use the Presto CLI.
-
Start the CLI and connect to your Presto server with your
cassandracatalog:./presto --server http://localhost:8080 --catalog cassandraThe
--serveroption specifies the address and port of the Presto coordinator, and thecatalogoption sets the default catalog. -
Run an SQL query against your database.
The following examples use this integration’s optional demo tables:
Example: Get total number of customers
QuerySELECT COUNT(*) AS customer_count FROM banking_db.customer;Resultcustomer_count ---------------- 2 (1 row)Example: Get email addresses for customers with a balance of 300 or more
QuerySELECT DISTINCT email AS customer_email FROM banking_db.customer INNER JOIN banking_db.accounts_by_customer ON (id = customer_id) WHERE account_balance >= 300.00;Resultcustomer_email ----------------- bob@example.org (1 row)Example: Get customers and the total balance of their accounts
QuerySELECT id AS customer_id, name AS customer_name, email AS customer_email, SUM ( CAST ( COALESCE(account_balance,0) AS DECIMAL(12,2) ) ) AS customer_funds FROM banking_db.customer LEFT OUTER JOIN banking_db.accounts_by_customer ON (id = customer_id) GROUP BY id, name, email;Resultcustomer_id | customer_name | customer_email | customer_funds --------------------------------------+---------------+-------------------+---------------- 8d6c1271-16b6-479d-8ea9-546c37381ab3 | Alice | alice@example.org | 300.03 0e5d9e8c-2e3b-4576-8515-58b491cb859e | Bob | bob@example.org | 700.07 (2 rows)