Querying Cassandra data using Spark SQL in Scala

You can execute Spark SQL queries in Scala by starting the Spark shell.

When you start Spark, DataStax Enterprise sets the context to allow you to run Spark SQL queries against Cassandra tables. The context object is named sqlContext and is an instance of HiveContext, a superset of SQLContext that uses the Hive metastore. You can call the setKeyspace method to set a default Cassandra keyspace, or qualify the table name with its keyspace directly in the query, and then use the sql method to execute the query.
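As a minimal sketch of the setKeyspace alternative (assuming a keyspace named my_keyspace_name containing a table my_table, as in the procedure below), setting the keyspace first lets you omit the keyspace qualifier in the query:

```scala
// Run inside the DSE Spark shell, where sqlContext is predefined.
// "my_keyspace_name" and "my_table" are placeholder names.
sqlContext.setKeyspace("my_keyspace_name")

// The table name no longer needs to be qualified with the keyspace.
val results = sqlContext.sql("SELECT * FROM my_table")
```

Either style produces the same result; qualifying the table name in the query, as the procedure does, avoids depending on shell state.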

Procedure

  1. Start the Spark shell.
    dse spark
  2. Use the sql method to pass in the query, storing the result in a variable.
    val results = sqlContext.sql("SELECT * FROM my_keyspace_name.my_table")
  3. Use the returned data.
    results.collect().foreach(println)
    CassandraRow{type_id: 1, value: 9685.807}
    CassandraRow{type_id: 2, value: -9775.808}
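Because the sql method returns a DataFrame, you are not limited to printing rows. As a sketch (assuming the value column shown in the sample output above), you can continue transforming the result with standard DataFrame operations before collecting:

```scala
// results is a DataFrame, so DataFrame operations apply.
// "value" is the column name from the sample table above.
val positiveValues = results.filter("value > 0")

// Actions such as count and collect trigger the query against Cassandra.
println(positiveValues.count())
```

Transformations like filter are lazy; the query runs against Cassandra only when an action such as count or collect is called.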