Accessing database data from Spark
DataStax Enterprise integrates Spark with DataStax Enterprise database. Database tables are fully usable from Spark.
Accessing the database from a Spark application
To access the database from a Spark application, follow the instructions in the Spark example, Portfolio Manager demo using Spark.
Accessing database data from the Spark shell
DataStax Enterprise uses the Spark Cassandra Connector to provide database integration for Spark. By running the Spark shell in DataStax Enterprise, you have access to enriched Spark context objects for accessing transactional nodes directly. See the Spark Cassandra Connector Java Doc on GitHub.
To access database data from the Spark shell, run the dse spark command and follow the instructions in the subsequent sections.
$ dse spark
Spark session available as 'spark'.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.0.0.1
/_/
Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_91)
Type in expressions to have them evaluated.
Type :help for more information.
scala>
The Spark shell creates a default Spark session named spark, an instance of org.apache.spark.sql.SparkSession.
The Spark shell also creates two contexts by default: sc (an instance of org.apache.spark.SparkContext) and sqlContext (an instance of org.apache.spark.sql.hive.HiveContext). For example, to bind the HiveContext to a shorter name:
val hc = sqlContext
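As a sketch of how the sc context is typically used with the Spark Cassandra Connector, the snippet below reads a hypothetical table test.kv as an RDD; the keyspace, table, and column names are illustrative assumptions, not part of this guide:

```scala
// The connector's implicits are available automatically in the DSE Spark shell;
// the import is shown here for clarity.
import com.datastax.spark.connector._

// Read the hypothetical table test.kv as an RDD of CassandraRow objects.
val rdd = sc.cassandraTable("test", "kv")

// Each CassandraRow exposes typed column accessors (column names are assumed).
rdd.collect().foreach(row => println(row.getString("key") + " -> " + row.getInt("value")))
```

Running this requires a DSE cluster with the test keyspace and kv table already created.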
Previous versions also created a CassandraSQLContext instance named csc. Starting in DSE 5.0, this is no longer the case; use the sqlContext object instead.
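As a sketch of querying database data through sqlContext, the snippet below loads the same hypothetical test.kv table as a DataFrame and runs a SQL query over it; all keyspace, table, and column names are assumptions for illustration:

```scala
// sqlContext is created for you by the DSE Spark shell.
// Load a Cassandra table as a DataFrame via the connector's data source
// (keyspace and table names are hypothetical).
val df = sqlContext.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "test", "table" -> "kv"))
  .load()

// Register a temporary view so the table can be queried with SQL.
df.createOrReplaceTempView("kv")
sqlContext.sql("SELECT key, value FROM kv WHERE value > 10").show()
```

This replaces the csc-based workflow from earlier DSE releases with the standard Spark SQL API.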