Creating a Spark Structured Streaming sink using DSE

Spark Structured Streaming is a high-level API for building streaming applications. DSE supports Structured Streaming sinks for writing streaming data into DSE tables.

The following Scala example shows how to write data from a streaming source to a DSE table using the cassandraFormat method.

import org.apache.spark.sql.cassandra._           // provides the cassandraFormat method on DataStreamWriter
import org.apache.spark.sql.streaming.OutputMode

val query = source.writeStream
  .option("checkpointLocation", checkpointDir.toString)  // checkpoint directory for fault tolerance
  .cassandraFormat("<table name>", "<keyspace name>")    // target table and keyspace in DSE
  .outputMode(OutputMode.Update)
  .start()

This example sets the OutputMode to Update; the available output modes are described in the Spark API documentation.
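
For context, the following is a minimal end-to-end sketch. It uses Spark's built-in rate source as a stand-in streaming source, and it assumes a hypothetical test_ks.kv table (with an int key column and a bigint value column) and a hypothetical checkpoint path; substitute your own names.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.cassandra._
import org.apache.spark.sql.streaming.OutputMode

val spark = SparkSession.builder.appName("stream-to-dse").getOrCreate()

// The built-in rate source emits rows with `timestamp` and `value` columns.
val source = spark.readStream
  .format("rate")
  .option("rowsPerSecond", 10)
  .load()
  .selectExpr("CAST(value AS INT) AS key", "value")    // match the assumed kv schema

val query = source.writeStream
  .option("checkpointLocation", "/tmp/checkpoints/kv") // hypothetical checkpoint path
  .cassandraFormat("kv", "test_ks")                    // hypothetical table and keyspace
  .outputMode(OutputMode.Update)
  .start()

query.awaitTermination()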

The cassandraFormat method is equivalent to calling the format method with org.apache.spark.sql.cassandra and setting the keyspace and table options.

import org.apache.spark.sql.streaming.OutputMode

val query = source.writeStream
  .option("checkpointLocation", checkpointDir.toString)
  .format("org.apache.spark.sql.cassandra")  // the DSE/Cassandra sink data source
  .option("keyspace", ks)
  .option("table", "kv")
  .outputMode(OutputMode.Update)
  .start()
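
Once started, the streaming query runs until it terminates or is stopped. A short sketch of managing the query with the standard Spark StreamingQuery API follows; it continues the example above.

// Inspect the query while it runs.
println(query.status)        // current state of the query
println(query.lastProgress)  // metrics from the most recent micro-batch

// Either block the calling thread until the query terminates ...
query.awaitTermination()

// ... or stop it explicitly from another thread.
// query.stop()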
