Using AlwaysOn SQL service
AlwaysOn SQL is a high availability service that responds to SQL queries from JDBC and ODBC applications. By default, AlwaysOn SQL is disabled. It is built on top of the Spark SQL Thriftserver, but provides failover and caching between instances so there is no single point of failure. AlwaysOn SQL provides enhanced security, leveraging the same user management as the rest of DSE, executing queries to the underlying database as the user authenticated to AlwaysOn SQL.
In order to run AlwaysOn SQL, you must have:
A running datacenter with DSE Analytics nodes enabled.
Enabled AlwaysOn SQL on every Analytics node in the datacenter.
Modified the replication factor for all Analytics nodes, if necessary.
Set the node's address in cassandra.yaml to an IP address accessible by the AlwaysOn SQL clients. This address depends on your network topology and deployment scenario.
Configured AlwaysOn SQL for security, if authentication is enabled.
Ensured you have sufficient resources allocated to start AlwaysOn SQL on your cluster.
Lifecycle Manager allows you to enable and configure AlwaysOn SQL in managed clusters.
When AlwaysOn SQL is enabled within an Analytics datacenter, all nodes within the datacenter must have AlwaysOn SQL enabled. Use dsetool ring to find which nodes in the datacenter are Analytics nodes.
AlwaysOn SQL is not supported when using DSE Multi-Instance or other deployments with multiple DSE instances on the same server.
The dse client-tool alwayson-sql command controls the server. The command works on the local datacenter unless you specify the datacenter with the --dc option:

dse client-tool alwayson-sql --dc <datacenter_name> <command>
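The invocation pattern above can be sketched as a small helper that assembles the argument list; the helper name and structure are illustrative, not part of DSE.

```python
# Sketch: build a `dse client-tool alwayson-sql` invocation, optionally
# scoped to a datacenter with --dc (per the syntax shown above).
# The helper is illustrative and not part of DSE itself.

def alwayson_sql_argv(command, datacenter=None):
    """Return the argv list for `dse client-tool alwayson-sql`."""
    argv = ["dse", "client-tool", "alwayson-sql"]
    if datacenter is not None:
        argv += ["--dc", datacenter]  # target a specific datacenter
    argv.append(command)              # e.g. start, stop, restart, status
    return argv

# Local datacenter (no --dc):
print(" ".join(alwayson_sql_argv("status")))
# Explicit datacenter:
print(" ".join(alwayson_sql_argv("start", datacenter="dc-west")))
```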
The alwayson_sql_options section in dse.yaml, described in detail at AlwaysOn SQL, has options for setting the ports, timeout values, log location, and other Spark or Hive configuration settings. Additional configuration options are located in spark-alwayson-sql.conf. AlwaysOn SQL binds to the address set in cassandra.yaml.
If you have changed some configuration settings in dse.yaml while AlwaysOn SQL is running, you can have the server pick up the new configuration by entering:
dse client-tool alwayson-sql reconfig
Only a subset of settings can be changed at runtime using reconfig. Changing other options requires a restart of the server. Enabling or disabling AlwaysOn SQL requires restarting DSE.
The spark-alwayson-sql.conf file contains Spark and Hive settings as properties.
When AlwaysOn SQL is started, spark-alwayson-sql.conf is scanned for Spark properties, similar to other Spark applications started with dse spark-submit. Properties that begin with spark.hive are submitted as properties using --hiveconf, with the spark prefix removed.
For example, if spark-alwayson-sql.conf has the setting spark.hive.server2.table.type.mapping=CLASSIC, that setting will be converted to --hiveconf hive.server2.table.type.mapping=CLASSIC when AlwaysOn SQL is started.
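The prefix-based routing described above can be sketched as follows. This is a model of the documented behavior, not DSE's actual implementation:

```python
# Sketch of the documented property routing: properties beginning with
# "spark.hive" are passed as --hiveconf arguments with the "spark."
# prefix removed. Illustrative only; not DSE's code.

def hiveconf_args(properties):
    """Turn spark.hive.* properties into --hiveconf arguments."""
    args = []
    for key, value in properties.items():
        if key.startswith("spark.hive"):
            hive_key = key[len("spark."):]   # strip the spark. prefix
            args += ["--hiveconf", f"{hive_key}={value}"]
    return args

props = {"spark.hive.server2.table.type.mapping": "CLASSIC",
         "spark.executor.memory": "2g"}      # non-hive property is untouched
print(hiveconf_args(props))
# ['--hiveconf', 'hive.server2.table.type.mapping=CLASSIC']
```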
Configuring AlwaysOn SQL service with DSE Analytics Solo is not supported.
If you have enabled AlwaysOn SQL, it will start when the cluster is started. If AlwaysOn SQL is enabled and DSE is restarted, AlwaysOn SQL will be started regardless of the previous state of AlwaysOn SQL. You only need to explicitly start the server if it has been stopped, for example for a configuration change.
To start AlwaysOn SQL service:
dse client-tool alwayson-sql start
To start the server on a specific datacenter, specify the datacenter name with the --dc option:
dse client-tool alwayson-sql --dc dc-west start
To completely stop AlwaysOn SQL service:
dse client-tool alwayson-sql stop
The server must be manually started after issuing a stop command.
To restart a running server:
dse client-tool alwayson-sql restart
To find the status of AlwaysOn SQL, issue a status command using dse client-tool:
dse client-tool alwayson-sql status
You can also view the status in a web browser by going to http://<node name or IP address>:<AlwaysOn SQL web UI port>. By default, the port is 9077. For example, if 10.10.10.1 is the IP address of an Analytics node with AlwaysOn SQL enabled, navigate to http://10.10.10.1:9077.
The returned status is one of:
RUNNING: the server is running and ready to accept client requests.
STOPPED_AUTO_RESTART: the server is being started but is not yet ready to accept client requests.
STOPPED_MANUAL_RESTART: the server was stopped with either a stop or restart command. If the server was issued a restart command, the status will change to STOPPED_AUTO_RESTART as the server starts again.
STARTING: the server is actively starting up but is not yet ready to accept client requests.
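A client might interpret these states when deciding whether to wait or intervene. The state names come from the list above; the helper itself is an illustrative sketch, not a DSE API:

```python
# Sketch: interpret the AlwaysOn SQL status strings listed above.
# Only the state names come from the documentation; the helper and
# its groupings are illustrative.

READY = {"RUNNING"}
TRANSIENT = {"STARTING", "STOPPED_AUTO_RESTART"}   # will become RUNNING on its own
NEEDS_MANUAL_START = {"STOPPED_MANUAL_RESTART"}    # requires an explicit start

def can_accept_requests(status):
    """True only when the server is ready for client requests."""
    return status in READY

def will_recover_unaided(status):
    """True when no operator action is needed to reach RUNNING."""
    return status in READY or status in TRANSIENT

print(can_accept_requests("RUNNING"))                  # True
print(will_recover_unaided("STOPPED_MANUAL_RESTART"))  # False
```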
To increase performance, you can specify tables to be cached in RAM using the CACHE TABLE directive.
Permanent cached tables will be recached on server restart.
You can cache an existing table by issuing a CACHE TABLE Spark SQL command through a client:
CACHE TABLE <keyspace_name>.<table_name>;
CACHE TABLE <keyspace_name>.<table_name> AS <select statement>;
The temporary cache table is only valid for the session in which it was created, and will not be recreated on server restart.
Create a permanent cache table using the CREATE CACHE TABLE directive and a select statement:

CREATE CACHE TABLE <keyspace_name>.<table_name> AS <select_statement>;
The table cache can be destroyed using the UNCACHE TABLE and CLEAR CACHE directives.
UNCACHE TABLE <keyspace_name>.<table_name>;
The CLEAR CACHE directive removes all cached tables, and DROP TABLE removes all table metadata, including the cache.
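The cache directives above can be composed programmatically before being sent through a client. The SQL directives come from the documentation; the helper functions and their identifier check are an illustrative sketch:

```python
# Sketch: compose the Spark SQL cache directives described above.
# The directive syntax is documented; the helpers are illustrative.
import re

def _ident(name):
    """Allow only simple identifiers to avoid malformed statements."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name

def cache_table(keyspace, table, select=None):
    """CACHE TABLE, optionally built from a query (session-only cache)."""
    stmt = f"CACHE TABLE {_ident(keyspace)}.{_ident(table)}"
    if select is not None:
        stmt += f" AS {select}"
    return stmt + ";"

def uncache_table(keyspace, table):
    """UNCACHE TABLE for a single cached table."""
    return f"UNCACHE TABLE {_ident(keyspace)}.{_ident(table)};"

print(cache_table("ks", "users"))
# CACHE TABLE ks.users;
print(uncache_table("ks", "users"))
# UNCACHE TABLE ks.users;
```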
AlwaysOn SQL logs activity to the log location configured in dse.yaml. When the AlwaysOn SQL server is stopped, the contents of the log file are sent to the global system.log file, and are also copied to DSEFS.
- Enabling SSL for AlwaysOn SQL
Communication with AlwaysOn SQL can be encrypted using SSL.
- Using authentication with AlwaysOn SQL
AlwaysOn SQL can be configured to use DSE authentication.
- Simba JDBC Driver for Apache Spark
The Simba JDBC driver allows you to access AlwaysOn SQL.
- Simba ODBC Driver for Apache Spark
The Simba ODBC Driver for Spark allows you to connect to AlwaysOn SQL.
- Connecting to AlwaysOn SQL server using Beeline
Use Spark Beeline to test AlwaysOn SQL.