com.datastax.bdp.spark.writer.BulkTableWriter
Writes RDD data to sstables in a local temp directory and then streams the sstables to the Cassandra cluster. The keyspace and table must exist.
Depending on the setup, this method may or may not be faster than a standard saveToCassandra call.
Compared to saveToCassandra, this method does more work on the client side: it
uses more memory and I/O on the client, but it puts less stress on the server.
Use bulk saving if you experience timeouts or server-side OOMs when using saveToCassandra.
Make sure each of your Spark partitions holds at least several tens of MB of data,
because bulkSaveToCassandra generates at least one sstable per Spark partition.
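The sizing advice above comes down to simple arithmetic. The sketch below is one way to pick a partition count. PartitionSizing and maxPartitions are hypothetical names that are not part of BulkTableWriter, and the 64 MB figure is only an illustrative reading of "several tens of MB".

```scala
object PartitionSizing {
  // Hypothetical helper, not part of the BulkTableWriter API.
  // Returns the largest partition count that still keeps the average
  // partition at or above minPartitionBytes, so that each generated
  // sstable is not too small.
  def maxPartitions(totalBytes: Long, minPartitionBytes: Long): Int = {
    require(totalBytes >= 0, "totalBytes must be non-negative")
    require(minPartitionBytes > 0, "minPartitionBytes must be positive")
    // Integer division rounds down, which keeps the average partition
    // size at or above the minimum.
    val n = totalBytes / minPartitionBytes
    // Always use at least one partition and stay within Int range.
    math.max(1L, math.min(n, Int.MaxValue.toLong)).toInt
  }
}
```

For example, about 10 GB of data with an assumed 64 MB minimum gives 160 partitions. The result could be passed to the RDD's repartition or coalesce method before the bulk save.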
Import BulkTableWriter._ to enhance your RDDs with bulk saving capability.
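A minimal usage sketch follows. It assumes the DSE Spark libraries are on the classpath and that the job is launched through the cluster's own submit tooling, so that the Spark master and Cassandra connection settings are supplied externally. The keyspace test, the table words and its two-column schema are hypothetical. The (keyspace, table) argument list is also an assumption that should be checked against the method signature on this page.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.bdp.spark.writer.BulkTableWriter._

object BulkSaveExample {
  def main(args: Array[String]): Unit = {
    // Master URL and Cassandra connection settings are assumed to come
    // from the environment that submits the job.
    val sc = new SparkContext(new SparkConf().setAppName("bulk-save-example"))

    // Hypothetical target table, which must already exist:
    //   CREATE TABLE test.words (word text PRIMARY KEY, count int)
    val rdd = sc.parallelize(Seq(("foo", 1), ("bar", 2)))

    // Assumed argument list: keyspace name, then table name.
    rdd.bulkSaveToCassandra("test", "words")

    sc.stop()
  }
}
```

In a real job each partition would hold far more data than this two-row example, in line with the partition sizing advice above.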