Querying database data using Apache Spark™ SQL in Scala
When you start Spark, DataStax Enterprise creates a Spark session instance to allow you to run Spark SQL queries against database tables.
The session object is named spark and is an instance of org.apache.spark.sql.SparkSession. Use the sql method to execute the query.
Procedure
-
Start the Spark shell.
dse spark
-
Use the sql method to pass in the query, storing the result in a variable.
val results = spark.sql("SELECT * from my_keyspace_name.my_table")
-
Use the returned data.
results.show()
+--------------------+-----------+
|                  id|description|
+--------------------+-----------+
|de2d0de1-4d70-11e...|      thing|
|db7e4191-4d70-11e...|    another|
|d576ad50-4d70-11e...|yet another|
+--------------------+-----------+
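The value returned by the sql method is a Spark DataFrame, so it can be inspected and narrowed with the standard DataFrame API as well as displayed. The following is a minimal sketch rather than a prescribed workflow. It assumes it is run inside the dse spark shell, where spark is already defined, and that my_keyspace_name.my_table has the id and description columns shown in the output above. The names matching, counts, and my_table_view are hypothetical and chosen only for illustration.

```scala
// Assumption: run inside the `dse spark` shell, where `spark` already exists.
// Assumption: my_keyspace_name.my_table has columns `id` and `description`.
import org.apache.spark.sql.DataFrame

val results: DataFrame = spark.sql("SELECT * FROM my_keyspace_name.my_table")

// Print the column names and types that Spark inferred for the table.
results.printSchema()

// Narrow the result with the DataFrame API instead of another SQL string.
val matching: DataFrame = results
  .filter(results("description") === "thing")
  .select("id", "description")

// collect() brings rows to the driver; only do this for small results.
matching.collect().foreach { row =>
  println(s"${row.getAs[Any]("id")} -> ${row.getAs[String]("description")}")
}

// Register a temporary view (hypothetical name) to run further SQL
// against the same data within this session.
results.createOrReplaceTempView("my_table_view")
val counts: DataFrame = spark.sql(
  "SELECT description, COUNT(*) AS n FROM my_table_view GROUP BY description")
counts.show()
```

Filtering and aggregating before calling collect() or show() keeps the work distributed across the cluster and limits how much data reaches the driver.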