Query database data using Apache Spark SQL in Scala
When you start Apache Spark™, DataStax Enterprise (DSE) creates a Spark session that you can use to run Spark SQL queries against database tables. The session object is named spark and is an instance of org.apache.spark.sql.SparkSession. To run a query, pass the query string to the session's sql method, which returns the result as a DataFrame.
Procedure
-
Start the Spark shell.
dse spark
-
Use the sql method to pass in the query, storing the result in a variable.
val results = spark.sql("SELECT * FROM my_keyspace_name.my_table")
-
Use the returned data.
results.show()
+--------------------+-----------+
|                  id|description|
+--------------------+-----------+
|de2d0de1-4d70-11e...|      thing|
|db7e4191-4d70-11e...|    another|
|d576ad50-4d70-11e...|yet another|
+--------------------+-----------+
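Because the sql method returns a DataFrame, the result can be processed with the standard DataFrame operations as well as displayed. The following is a minimal sketch, not a definitive recipe. It assumes that it runs inside the Spark shell started with dse spark, where spark is already defined, and that my_keyspace_name.my_table has the id and description columns shown in the output above, with description stored as text. The variable names things and rows are illustrative only.

```scala
// Run inside the DSE Spark shell, where `spark` is predefined.
// Assumption: my_keyspace_name.my_table has `id` and `description` columns,
// and `description` is a text column.
import org.apache.spark.sql.functions.col

val results = spark.sql("SELECT * FROM my_keyspace_name.my_table")

// Print the column names and types that Spark reports for the table.
results.printSchema()

// Narrow the result with DataFrame operations instead of a second SQL string.
val things = results.filter(col("description") === "thing").select("id")

// Passing false disables truncation, so long values such as UUIDs print in full.
things.show(false)

// Bring the rows back to the driver as an Array[Row].
// Reasonable only for small results, because everything is held in driver memory.
val rows = results.collect()
rows.foreach(row => println(row.getAs[String]("description")))
```

For large tables, keeping the work in DataFrame operations such as filter and select is usually preferable to collect, because those operations run on the cluster rather than on the driver.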