com.datastax.spark

package connector

The root package of the Cassandra connector for Apache Spark. Offers handy implicit conversions that add Cassandra-specific methods to SparkContext and RDD.

Call the cassandraTable method on a SparkContext object to create a CassandraRDD that exposes a Cassandra table as a Spark RDD.

Call the saveToCassandra function (provided by RDDFunctions) on any RDD to save the distributed collection to a Cassandra table.

Example:

CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
CREATE TABLE test.words (word text PRIMARY KEY, count int);
INSERT INTO test.words (word, count) VALUES ('and', 50);

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

val sparkMasterHost = "127.0.0.1"
val cassandraHost = "127.0.0.1"
val keyspace = "test"
val table = "words"

// Tell Spark the address of one Cassandra node:
val conf = new SparkConf(true).set("spark.cassandra.connection.host", cassandraHost)

// Connect to the Spark cluster:
val sc = new SparkContext("spark://" + sparkMasterHost + ":7077", "example", conf)

// Read the table and print its contents:
val rdd = sc.cassandraTable(keyspace, table)
rdd.collect().foreach(println)

// Write two rows to the table:
val col = sc.parallelize(Seq(("of", 1200), ("the", 863)))
col.saveToCassandra(keyspace, table)

sc.stop()
Linear Supertypes
AnyRef, Any

Type Members

  1. sealed trait BatchSize extends AnyRef
  2. case class BytesInBatch(batchSize: Int) extends BatchSize with Product with Serializable
  3. final class CassandraRow extends ScalaGettableData with Serializable

    Represents a single row fetched from Cassandra. Offers getters to read individual fields by column name or column index. The getters try to convert the value to the desired type whenever possible; most column types can be converted to a String. For nullable columns, use the getXXXOption getters, which convert nulls to None values; otherwise a NullPointerException would be thrown (see the sketch at the end of this entry).

    All getters throw an exception if the column name/index is not found. Column indexes start at 0.

    If the value cannot be converted to the desired type, com.datastax.spark.connector.types.TypeConversionException is thrown.

    Recommended getters for Cassandra types:

    - ascii: getString, getStringOption
    - bigint: getLong, getLongOption
    - blob: getBytes, getBytesOption
    - boolean: getBool, getBoolOption
    - counter: getLong, getLongOption
    - decimal: getDecimal, getDecimalOption
    - double: getDouble, getDoubleOption
    - float: getFloat, getFloatOption
    - inet: getInet, getInetOption
    - int: getInt, getIntOption
    - text: getString, getStringOption
    - timestamp: getDate, getDateOption
    - timeuuid: getUUID, getUUIDOption
    - uuid: getUUID, getUUIDOption
    - varchar: getString, getStringOption
    - varint: getVarInt, getVarIntOption
    - list: getList[T]
    - set: getSet[T]
    - map: getMap[K, V]

    The collection getters getList, getSet, and getMap require the item type to be passed explicitly:

    row.getList[String]("a_list")
    row.getList[Int]("a_list")
    row.getMap[Int, String]("a_map")

    The generic get method can also convert collections to other collection types automatically. Supported containers:

    - scala.collection.immutable.List
    - scala.collection.immutable.Set
    - scala.collection.immutable.TreeSet
    - scala.collection.immutable.Vector
    - scala.collection.immutable.Map
    - scala.collection.immutable.TreeMap
    - scala.collection.Iterable
    - scala.collection.IndexedSeq
    - java.util.ArrayList
    - java.util.HashSet
    - java.util.HashMap

    Example:

    row.get[List[Int]]("a_list")
    row.get[Vector[Int]]("a_list")
    row.get[java.util.ArrayList[Int]]("a_list")
    row.get[TreeMap[Int, String]]("a_map")

    Timestamps can be converted to other date types by using the generic get. Supported date types:

    - java.util.Date
    - java.sql.Date
    - org.joda.time.DateTime
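    For nullable columns, a minimal sketch of the Option-returning getters, reusing the test.words table from the package example above:

    val row = sc.cassandraTable("test", "words").first
    val count: Option[Int] = row.getIntOption("count") // None if the cell is null
    val word: String = row.getString("word")           // a null cell here would throw NullPointerException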

  4. case class CassandraRowMetadata(columnNames: IndexedSeq[String], resultSetColumnNames: Option[IndexedSeq[String]] = None, codecs: IndexedSeq[TypeCodec[AnyRef]] = null) extends Product with Serializable

    Data shared by all CassandraRow instances.

    - columnNames: row column names
    - resultSetColumnNames: column names from the Java driver row result set, without connector aliases
    - codecs: cached Java driver codecs, kept to avoid registry lookups

  5. final class CassandraTableScanPairRDDFunctions[K, V] extends Serializable
  6. final class CassandraTableScanRDDFunctions[R] extends Serializable
  7. sealed trait CollectionBehavior extends AnyRef

    Insert behaviors for Collections.

  8. case class CollectionColumnName(columnName: String, alias: Option[String] = None, collectionBehavior: CollectionBehavior = CollectionOverwrite) extends ColumnRef with Product with Serializable

    References a collection column by name with insert instructions
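    As a sketch of saving with a non-default behavior (the table test.collections and its list column lcol are hypothetical; the append qualifier is provided by ColumnNameFunctions below):

    // Append to the list column instead of overwriting it:
    sc.parallelize(Seq(("key1", Vector(1, 2, 3))))
      .saveToCassandra("test", "collections", SomeColumns("key", "lcol" append))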

  9. case class ColumnName(columnName: String, alias: Option[String] = None) extends ColumnRef with Product with Serializable

    References a column by name.
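    For illustration, a sketch of selecting columns under aliases, relying on the implicit String-to-ColumnName conversion (toNamedColumnRef under Value Members) and the test.words table from the package example:

    sc.cassandraTable[(String, Int)]("test", "words")
      .select("word" as "w", "count" as "c")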

  10. implicit final class ColumnNameFunctions extends AnyVal
  11. class ColumnNotFoundException extends Exception

    Thrown when the requested column does not exist in the result set.

  12. sealed trait ColumnRef extends AnyRef

    A column that can be selected from a CQL result set by name.

  13. sealed trait ColumnSelector extends AnyRef
  14. class DatasetFunctions[K] extends Serializable

    Provides Cassandra-specific methods on org.apache.spark.sql.DataFrame.
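    As a sketch, creating a Cassandra table from a DataFrame's schema and then writing to it (df, the keyspace, and the table name words_copy are assumptions for illustration):

    // Derive the table schema from the DataFrame, with "word" as the partition key:
    df.createCassandraTable("test", "words_copy", partitionKeyColumns = Some(Seq("word")))
    df.write.format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "test", "table" -> "words_copy"))
      .save()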

  15. class DseSparkExtensions extends (SparkSessionExtensions) ⇒ Unit with Logging
  16. case class FunctionCallRef(columnName: String, actualParams: Seq[Either[ColumnRef, String]] = Seq.empty, alias: Option[String] = None) extends ColumnRef with Product with Serializable

    References a function call.

  17. trait GettableByIndexData extends Serializable
  18. trait GettableData extends GettableByIndexData
  19. class PairRDDFunctions[K, V] extends Serializable
  20. class RDDFunctions[T] extends WritableToCassandra[T] with Serializable

    Provides Cassandra-specific methods on RDD.

  21. case class RowsInBatch(batchSize: Int) extends BatchSize with Product with Serializable
  22. trait ScalaGettableByIndexData extends GettableByIndexData
  23. trait ScalaGettableData extends ScalaGettableByIndexData with GettableData
  24. case class SomeColumns(columns: ColumnRef*) extends ColumnSelector with Product with Serializable
  25. class SparkContextFunctions extends Serializable

    Provides Cassandra-specific methods on SparkContext.

  26. case class TTL(columnName: String, alias: Option[String] = None) extends ColumnRef with Product with Serializable

    References the TTL of a column.

  27. final case class TupleValue(values: Any*) extends ScalaGettableByIndexData with Product with Serializable
  28. final case class UDTValue(columnNames: IndexedSeq[String], columnValues: IndexedSeq[AnyRef]) extends ScalaGettableData with Product with Serializable
  29. case class WriteTime(columnName: String, alias: Option[String] = None) extends ColumnRef with Product with Serializable

    References the write time of a column.
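    A sketch of selecting the TTL and write time of a cell alongside its value, using the ttl and writeTime qualifiers from ColumnNameFunctions (table from the package example):

    sc.cassandraTable("test", "words")
      .select("word", "count", "count".ttl, "count".writeTime)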

Value Members

  1. object AllColumns extends ColumnSelector with Product with Serializable
  2. object BatchSize
  3. object CassandraRow extends Serializable
  4. object CassandraRowMetadata extends Serializable
  5. object CollectionAppend extends CollectionBehavior with Product with Serializable
  6. object CollectionOverwrite extends CollectionBehavior with Product with Serializable
  7. object CollectionPrepend extends CollectionBehavior with Product with Serializable
  8. object CollectionRemove extends CollectionBehavior with Product with Serializable
  9. object DocUtil
  10. object GettableData extends Serializable
  11. object PartitionKeyColumns extends ColumnSelector with Product with Serializable
  12. object PrimaryKeyColumns extends ColumnSelector with Product with Serializable
  13. object RowCountRef extends ColumnRef with Product with Serializable

    References a row count value returned from SELECT count(*).

  14. object SomeColumns extends Serializable
  15. object TupleValue extends Serializable
  16. object UDTValue extends Serializable
  17. package cql

    Contains the cql.CassandraConnector object, which is used to connect to a Cassandra cluster and to send CQL statements to it. CassandraConnector provides a Scala-idiomatic way of working with Cluster and Session objects and takes care of connection pooling and proper resource disposal.
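    A minimal sketch of using it directly, assuming conf is the SparkConf from the package example (the table created here is illustrative):

    import com.datastax.spark.connector.cql.CassandraConnector

    CassandraConnector(conf).withSessionDo { session =>
      session.execute("CREATE TABLE IF NOT EXISTS test.other_words (word text PRIMARY KEY, count int)")
    }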

  18. package japi
  19. package mapper

    Provides machinery for mapping Cassandra tables to user-defined Scala classes or tuples. The main class in this package is mapper.ColumnMapper, responsible for matching a Scala object's properties with Cassandra column names.
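    For example, a sketch of mapping rows of the test.words table from the package example onto a case class (the class name WordCount is illustrative):

    case class WordCount(word: String, count: Int)

    // ColumnMapper matches the constructor parameters to the word and count columns:
    val words = sc.cassandraTable[WordCount]("test", "words")
    words.saveToCassandra("test", "words")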

  20. package rdd

    Contains the com.datastax.spark.connector.rdd.CassandraTableScanRDD class, which is the main entry point for analyzing Cassandra data from Spark.

  21. package streaming
  22. implicit def toCassandraTableScanFunctions[T](rdd: CassandraTableScanRDD[T]): CassandraTableScanRDDFunctions[T]
  23. implicit def toCassandraTableScanRDDPairFunctions[K, V](rdd: CassandraTableScanRDD[(K, V)]): CassandraTableScanPairRDDFunctions[K, V]
  24. implicit def toDataFrameFunctions(dataFrame: DataFrame): DatasetFunctions[Row]
  25. implicit def toDatasetFunctions[K](dataset: Dataset[K])(implicit arg0: Encoder[K]): DatasetFunctions[K]
  26. implicit def toNamedColumnRef(columnName: String): ColumnName
  27. implicit def toPairRDDFunctions[K, V](rdd: RDD[(K, V)]): PairRDDFunctions[K, V]
  28. implicit def toRDDFunctions[T](rdd: RDD[T]): RDDFunctions[T]
  29. implicit def toSparkContextFunctions(sc: SparkContext): SparkContextFunctions
  30. package types

    Offers type conversion magic, so you can receive Cassandra column values in the form you like the most. Simply specify the type you want to use on the Scala side, and the column value will be converted automatically. This also works with complex objects like collections.

  31. package util

    Useful stuff that didn't fit elsewhere.

  32. package writer

    Contains components for writing RDDs to Cassandra.
