Delete data from Cassandra table, using data from the RDD as a list of primary keys.
Delete data from Cassandra table, using data from the RDD as a list of primary keys. Uses the specified column names. By default, it deletes all columns from corresponding Cassandra rows.
Example:
CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1 }; CREATE TABLE test.words(word VARCHAR PRIMARY KEY, count INT, other VARCHAR); INSERT INTO test.words((word,count,other) values ('foo', 5, 'foo');
case class WordCount(word: String, count: Int, other: String) val rdd = sc.cassandraTable("test", "words")
The underlying RDD class must provide data for all primary key columns. Delete "other" column values
rdd.deleteFromCassandra("test", "words", Seq("other"))
This delete consistency level and other properties are the same as for writes
the name of the Keyspace to use
the name of the Table to use
The list of column names to delete, empty ColumnSelector means full row.
Primary key columns selector, Optional. All RDD primary columns columns will be checked by default
additional configuration object allowing to set consistency level, batch size, etc.
Saves the data from RDD to a Cassandra table.
Saves the data from RDD to a Cassandra table. By default, it saves all properties that have corresponding Cassandra columns.
Example:
CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1 }; CREATE TABLE test.words(word VARCHAR PRIMARY KEY, count INT, other VARCHAR);
case class WordCount(word: String, count: Int, other: String) val rdd = sc.parallelize(Seq(WordCount("foo", 5, "bar")))
By default, the underlying RDD class must provide data for all columns:
rdd.saveToCassandra("test", "words")
By default, writes are performed at ConsistencyLevel.LOCAL_QUORUM. This write consistency level is controlled by the following property:
the name of the Keyspace to use
the name of the Table to use
The list of column names to save data to. Uses only the unique column names, and you must select at least all primary key columns. All other fields are discarded. Non-selected property/column names are left unchanged.
additional configuration object allowing to set consistency level, batch size, etc.