com.datastax.spark.connector.rdd

CassandraTableScanRDD

class CassandraTableScanRDD[R] extends CassandraRDD[R] with CassandraTableRowReaderProvider[R] with SplitSizeEstimator[R]

RDD representing a table scan of a Cassandra table.

This class is the main entry point for analyzing data in a Cassandra database with Spark. Obtain objects of this class by calling com.datastax.spark.connector.SparkContextFunctions.cassandraTable.

Configuration properties should be passed in the SparkConf configuration of the SparkContext. CassandraRDD needs to open a connection to Cassandra, so it requires appropriate connection property values to be present in SparkConf. For the list of required and available properties, see CassandraConnector.
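
As a minimal sketch of the setup described above, assuming a Cassandra node reachable at 127.0.0.1 and a placeholder keyspace ks with a table kv (neither comes from the original documentation), the connection property can be set on SparkConf before the SparkContext is created:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._ // adds cassandraTable to SparkContext

object TableScanSetup {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("table-scan-setup")
      .setMaster("local[*]") // assumption: a local test run
      .set("spark.cassandra.connection.host", "127.0.0.1") // assumption: local Cassandra node

    val sc = new SparkContext(conf)
    try {
      // "ks" and "kv" are placeholder names for an existing keyspace and table.
      val rdd = sc.cassandraTable("ks", "kv")
      println(rdd.first())
    } finally {
      sc.stop()
    }
  }
}
```

Without an explicit type parameter, the rows come back as generic CassandraRow objects.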

CassandraRDD divides the data set into smaller partitions, processed locally on every cluster node. A data partition consists of one or more contiguous token ranges. To reduce the number of roundtrips to Cassandra, every partition is fetched in batches.

The following properties control the number of partitions and the fetch size:
- spark.cassandra.input.split.sizeInMB: approximate amount of data to be fetched into a single Spark partition, default 512 MB
- spark.cassandra.input.fetch.sizeInRows: number of CQL rows fetched per roundtrip, default 1000
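
To make the effect of these two properties concrete, here is a rough back-of-the-envelope sketch in plain Scala. It is not the connector's actual estimation logic; the helper names and the sample figures (a 10 GB table, 250,000 rows in one partition) are made up for illustration. It only shows how the split size and the fetch size bound the partition count and the number of roundtrips.

```scala
object SplitMath {
  // Rough number of Spark partitions: table size divided by split size, rounded up.
  def approxPartitions(tableSizeMB: Long, splitSizeMB: Long): Long = {
    require(splitSizeMB > 0, "split size must be positive")
    math.max(1L, (tableSizeMB + splitSizeMB - 1) / splitSizeMB)
  }

  // Rough number of roundtrips for one partition: rows divided by fetch size, rounded up.
  def approxRoundtrips(rows: Long, fetchSizeInRows: Long): Long = {
    require(fetchSizeInRows > 0, "fetch size must be positive")
    (rows + fetchSizeInRows - 1) / fetchSizeInRows
  }

  def main(args: Array[String]): Unit = {
    println(approxPartitions(10240, 512))   // 20
    println(approxRoundtrips(250000, 1000)) // 250
  }
}
```

With the defaults, a 10 GB table would be read as roughly 20 Spark partitions, and a partition holding 250,000 rows would need about 250 roundtrips.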

A CassandraRDD object gets serialized and sent to every Spark Executor, which then calls the compute method to fetch the data on every node. The getPreferredLocations method tells Spark the preferred nodes to fetch a partition from, so that the data for the partition are on the same node the task was sent to. If Cassandra nodes are collocated with Spark nodes, the queries are always sent to the Cassandra process running on the same node as the Spark Executor process, hence data are not transferred between nodes. If a Cassandra node fails or gets overloaded during a read, the queries are retried on a different node.

By default, reads are performed at ConsistencyLevel.LOCAL_ONE in order to leverage data-locality and minimize network traffic. This read consistency level is controlled by the spark.cassandra.input.consistency.level property.

Linear Supertypes
SplitSizeEstimator[R], CassandraTableRowReaderProvider[R], CassandraRDD[R], RDD[R], Logging, Serializable, Serializable, AnyRef, Any

Type Members

  1. type Self = CassandraTableScanRDD[R]

    This is slightly different from Scala's this.type. this.type is the unique singleton type of an object, which is not compatible with other instances of the same type, so returning anything other than this is not really possible without lying to the compiler through explicit casts. Here, Self is used to return a copy of the object: a different instance of the same type.

    Definition Classes
    CassandraTableScanRDD → CassandraRDD
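
    The difference from this.type can be sketched in plain Scala. The trait and class below (Configurable, Scan) are hypothetical stand-ins, not connector classes; they only illustrate how an abstract Self type member lets a method return a modified copy while keeping the precise static type.

    ```scala
    // Hypothetical stand-in for an immutable, configurable RDD-like class.
    trait Configurable {
      type Self <: Configurable
      def limitOpt: Option[Long]
      protected def copyWith(limit: Option[Long]): Self
      // Returns a new instance; with this.type only `this` itself could be returned.
      def limit(n: Long): Self = copyWith(Some(n))
    }

    final case class Scan(table: String, limitOpt: Option[Long] = None) extends Configurable {
      type Self = Scan
      protected def copyWith(limit: Option[Long]): Scan = copy(limitOpt = limit)
    }

    object SelfTypeDemo {
      def main(args: Array[String]): Unit = {
        val a = Scan("kv")
        val b: Scan = a.limit(10) // statically typed as Scan, not Configurable
        println(b.limitOpt)       // Some(10)
        println(a eq b)           // false: b is a different instance
      }
    }
    ```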

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. def ++(other: RDD[R]): RDD[R]

    Definition Classes
    RDD
  4. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  5. def aggregate[U](zeroValue: U)(seqOp: (U, R) ⇒ U, combOp: (U, U) ⇒ U)(implicit arg0: ClassTag[U]): U

    Definition Classes
    RDD
  6. def as[B, A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11](f: (A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4], arg6: TypeConverter[A5], arg7: TypeConverter[A6], arg8: TypeConverter[A7], arg9: TypeConverter[A8], arg10: TypeConverter[A9], arg11: TypeConverter[A10], arg12: TypeConverter[A11]): CassandraRDD[B]

    Definition Classes
    CassandraRDD
  7. def as[B, A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10](f: (A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4], arg6: TypeConverter[A5], arg7: TypeConverter[A6], arg8: TypeConverter[A7], arg9: TypeConverter[A8], arg10: TypeConverter[A9], arg11: TypeConverter[A10]): CassandraRDD[B]

    Definition Classes
    CassandraRDD
  8. def as[B, A0, A1, A2, A3, A4, A5, A6, A7, A8, A9](f: (A0, A1, A2, A3, A4, A5, A6, A7, A8, A9) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4], arg6: TypeConverter[A5], arg7: TypeConverter[A6], arg8: TypeConverter[A7], arg9: TypeConverter[A8], arg10: TypeConverter[A9]): CassandraRDD[B]

    Definition Classes
    CassandraRDD
  9. def as[B, A0, A1, A2, A3, A4, A5, A6, A7, A8](f: (A0, A1, A2, A3, A4, A5, A6, A7, A8) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4], arg6: TypeConverter[A5], arg7: TypeConverter[A6], arg8: TypeConverter[A7], arg9: TypeConverter[A8]): CassandraRDD[B]

    Definition Classes
    CassandraRDD
  10. def as[B, A0, A1, A2, A3, A4, A5, A6, A7](f: (A0, A1, A2, A3, A4, A5, A6, A7) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4], arg6: TypeConverter[A5], arg7: TypeConverter[A6], arg8: TypeConverter[A7]): CassandraRDD[B]

    Definition Classes
    CassandraRDD
  11. def as[B, A0, A1, A2, A3, A4, A5, A6](f: (A0, A1, A2, A3, A4, A5, A6) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4], arg6: TypeConverter[A5], arg7: TypeConverter[A6]): CassandraRDD[B]

    Definition Classes
    CassandraRDD
  12. def as[B, A0, A1, A2, A3, A4, A5](f: (A0, A1, A2, A3, A4, A5) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4], arg6: TypeConverter[A5]): CassandraRDD[B]

    Definition Classes
    CassandraRDD
  13. def as[B, A0, A1, A2, A3, A4](f: (A0, A1, A2, A3, A4) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4]): CassandraRDD[B]

    Definition Classes
    CassandraRDD
  14. def as[B, A0, A1, A2, A3](f: (A0, A1, A2, A3) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3]): CassandraRDD[B]

    Definition Classes
    CassandraRDD
  15. def as[B, A0, A1, A2](f: (A0, A1, A2) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2]): CassandraRDD[B]

    Definition Classes
    CassandraRDD
  16. def as[B, A0, A1](f: (A0, A1) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1]): CassandraRDD[B]

    Definition Classes
    CassandraRDD
  17. def as[B, A0](f: (A0) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0]): CassandraRDD[B]

    Maps each row into an object of a different type using the provided function, which takes column value(s) as argument(s). Can be used to convert each row to a tuple or a case class object:

    sc.cassandraTable("ks", "table")
      .select("column1")
      .as((s: String) => s)                 // yields CassandraRDD[String]
    
    sc.cassandraTable("ks", "table")
      .select("column1", "column2")
      .as((_: String, _: Long))             // yields CassandraRDD[(String, Long)]
    
    case class MyRow(key: String, value: Long)
    sc.cassandraTable("ks", "table")
      .select("column1", "column2")
      .as(MyRow)                            // yields CassandraRDD[MyRow]
    Definition Classes
    CassandraRDD
  18. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  19. def barrier(): RDDBarrier[R]

    Definition Classes
    RDD
    Annotations
    @Experimental() @Since( "2.4.0" )
  20. def cache(): CassandraTableScanRDD.this.type

    Definition Classes
    RDD
  21. def cartesian[U](other: RDD[U])(implicit arg0: ClassTag[U]): RDD[(R, U)]

    Definition Classes
    RDD
  22. def cassandraCount(): Long

    Counts the number of items in this RDD by selecting count(*) on the Cassandra table.

    Definition Classes
    CassandraTableScanRDD → CassandraRDD
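
    A brief sketch contrasting cassandraCount with the count method inherited from RDD. The keyspace ks and table kv are placeholder names, and an already configured SparkContext is assumed:

    ```scala
    import org.apache.spark.SparkContext
    import com.datastax.spark.connector._

    object CountExample {
      // "ks" and "kv" are placeholder names for an existing keyspace and table.
      def counts(sc: SparkContext): (Long, Long) = {
        val rdd = sc.cassandraTable("ks", "kv")
        val pushedDown = rdd.cassandraCount() // count(*) is evaluated on the Cassandra side
        val sparkSide  = rdd.count()          // inherited from RDD: rows are fetched, then counted in Spark
        (pushedDown, sparkSide)
      }
    }
    ```

    Both calls should return the same number on an unchanged table; the first one avoids transferring the rows to Spark.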
  23. lazy val cassandraPartitionerClassName: String

    Attributes
    protected
    Definition Classes
    CassandraTableRowReaderProvider
  24. def checkColumnsExistence(columns: Seq[ColumnRef]): Seq[ColumnRef]

    Attributes
    protected
    Definition Classes
    CassandraTableRowReaderProvider
  25. def checkpoint(): Unit

    Definition Classes
    RDD
  26. implicit val classTag: ClassTag[R]

  27. def clearDependencies(): Unit

    Attributes
    protected
    Definition Classes
    RDD
  28. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  29. def clusteringOrder(order: ClusteringOrder): Self

    Adds a CQL ORDER BY clause to the query. It can be applied only when the table has clustering columns and a primary key predicate is pushed down in where. It is useful when the default ordering direction of rows within a single Cassandra partition needs to be changed.

    Definition Classes
    CassandraRDD
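
    A minimal sketch of reversing the clustering order, assuming a hypothetical table ks.readings whose partition key is sensor_id and which has at least one clustering column (all names are placeholders). It uses withDescOrder, the shorthand listed further down this page:

    ```scala
    import org.apache.spark.SparkContext
    import com.datastax.spark.connector._
    import com.datastax.spark.connector.rdd.CassandraTableScanRDD

    object DescendingOrderExample {
      // Reads the rows of a single Cassandra partition in descending clustering order.
      def newestFirst(sc: SparkContext, sensorId: String): CassandraTableScanRDD[CassandraRow] =
        sc.cassandraTable("ks", "readings")
          .where("sensor_id = ?", sensorId) // partition key predicate pushed down to Cassandra
          .withDescOrder                    // adds ORDER BY ... DESC on the clustering columns
    }
    ```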
  30. val clusteringOrder: Option[ClusteringOrder]

    Definition Classes
    CassandraTableScanRDD → CassandraRDD
  31. def coalesce(numPartitions: Int, shuffle: Boolean = false, partitionCoalescer: Option[PartitionCoalescer])(implicit ord: Ordering[R] = null): RDD[R]

    This method overrides the default Spark behavior and will not create a CoalesceRDD. Instead, it reduces the number of partitions by adjusting the partitioning of Cassandra data on read. Using this method overrides spark.cassandra.input.split.sizeInMB. The method is useful together with a where() call, when the actual size of the data is smaller than the table size. It has no effect if a partition key is used in the where clause.

    numPartitions: number of partitions
    shuffle: whether to call shuffle after
    partitionCoalescer: is ignored if no shuffle, or just passed to shuffled CoalesceRDD
    returns: new CassandraTableScanRDD with predefined number of partitions

    Definition Classes
    CassandraTableScanRDD → RDD
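
    A short sketch of the use case described above, assuming a hypothetical table ks.kv with a filterable, non-partition-key column category (the names and the predicate are illustrative only). The remaining coalesce parameters are left at their defaults:

    ```scala
    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD
    import com.datastax.spark.connector._

    object CoalesceExample {
      // The predicate is expected to match only a small part of the table,
      // so a handful of partitions is enough for the result.
      def rareRows(sc: SparkContext): RDD[CassandraRow] =
        sc.cassandraTable("ks", "kv")
          .where("category = ?", "rare") // hypothetical column that Cassandra can filter on
          .coalesce(4)                   // read the table as 4 partitions instead of size-based splits
    }
    ```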
  32. def collect[U](f: PartialFunction[R, U])(implicit arg0: ClassTag[U]): RDD[U]

    Definition Classes
    RDD
  33. def collect(): Array[R]

    Definition Classes
    RDD
  34. val columnNames: ColumnSelector

  35. def compute(split: Partition, context: TaskContext): Iterator[R]

    Definition Classes
    CassandraTableScanRDD → RDD
  36. val connector: CassandraConnector

  37. def consistencyLevel: ConsistencyLevel

    Attributes
    protected
    Definition Classes
    CassandraTableRowReaderProvider
  38. def context: SparkContext

    Definition Classes
    RDD
  39. def convertTo[B](implicit arg0: ClassTag[B], arg1: RowReaderFactory[B]): CassandraTableScanRDD[B]

    Attributes
    protected
    Definition Classes
    CassandraTableScanRDD → CassandraRDD
  40. def copy(columnNames: ColumnSelector = columnNames, where: CqlWhereClause = where, limit: Option[CassandraLimit] = limit, clusteringOrder: Option[ClusteringOrder] = None, readConf: ReadConf = readConf, connector: CassandraConnector = connector): Self

    Returns a copy of this RDD with some of its properties changed.

    Attributes
    protected
    Definition Classes
    CassandraTableScanRDD → CassandraRDD
  41. def count(): Long

    Definition Classes
    RDD
  42. def countApprox(timeout: Long, confidence: Double): PartialResult[BoundedDouble]

    Definition Classes
    RDD
  43. def countApproxDistinct(relativeSD: Double): Long

    Definition Classes
    RDD
  44. def countApproxDistinct(p: Int, sp: Int): Long

    Definition Classes
    RDD
  45. def countByValue()(implicit ord: Ordering[R]): Map[R, Long]

    Definition Classes
    RDD
  46. def countByValueApprox(timeout: Long, confidence: Double)(implicit ord: Ordering[R]): PartialResult[Map[R, BoundedDouble]]

    Definition Classes
    RDD
  47. final def dependencies: Seq[Dependency[_]]

    Definition Classes
    RDD
  48. def distinct(): RDD[R]

    Definition Classes
    RDD
  49. def distinct(numPartitions: Int)(implicit ord: Ordering[R]): RDD[R]

    Definition Classes
    RDD
  50. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  51. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  52. def estimateSplitCount(splitSize: Long): Int

    Definition Classes
    SplitSizeEstimator
  53. def fetchSize: Int

    Attributes
    protected
    Definition Classes
    CassandraTableRowReaderProvider
  54. def filter(f: (R) ⇒ Boolean): RDD[R]

    Definition Classes
    RDD
  55. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  56. def first(): R

    Definition Classes
    RDD
  57. def firstParent[U](implicit arg0: ClassTag[U]): RDD[U]

    Attributes
    protected[org.apache.spark]
    Definition Classes
    RDD
  58. def flatMap[U](f: (R) ⇒ TraversableOnce[U])(implicit arg0: ClassTag[U]): RDD[U]

    Definition Classes
    RDD
  59. def fold(zeroValue: R)(op: (R, R) ⇒ R): R

    Definition Classes
    RDD
  60. def foreach(f: (R) ⇒ Unit): Unit

    Definition Classes
    RDD
  61. def foreachPartition(f: (Iterator[R]) ⇒ Unit): Unit

    Definition Classes
    RDD
  62. def getCheckpointFile: Option[String]

    Definition Classes
    RDD
  63. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  64. def getDependencies: Seq[Dependency[_]]

    Attributes
    protected
    Definition Classes
    RDD
  65. final def getNumPartitions: Int

    Definition Classes
    RDD
    Annotations
    @Since( "1.6.0" )
  66. def getOutputDeterministicLevel: org.apache.spark.rdd.DeterministicLevel.Value

    Attributes
    protected
    Definition Classes
    RDD
    Annotations
    @DeveloperApi()
  67. def getPartitions: Array[Partition]

    Definition Classes
    CassandraTableScanRDD → RDD
  68. def getPreferredLocations(split: Partition): Seq[String]

    Definition Classes
    CassandraTableScanRDD → RDD
  69. def getStorageLevel: StorageLevel

    Definition Classes
    RDD
  70. def glom(): RDD[Array[R]]

    Definition Classes
    RDD
  71. def groupBy[K](f: (R) ⇒ K, p: Partitioner)(implicit kt: ClassTag[K], ord: Ordering[K]): RDD[(K, Iterable[R])]

    Definition Classes
    RDD
  72. def groupBy[K](f: (R) ⇒ K, numPartitions: Int)(implicit kt: ClassTag[K]): RDD[(K, Iterable[R])]

    Definition Classes
    RDD
  73. def groupBy[K](f: (R) ⇒ K)(implicit kt: ClassTag[K]): RDD[(K, Iterable[R])]

    Definition Classes
    RDD
  74. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  75. val id: Int

    Definition Classes
    RDD
  76. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  77. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  78. def intersection(other: RDD[R], numPartitions: Int): RDD[R]

    Definition Classes
    RDD
  79. def intersection(other: RDD[R], partitioner: Partitioner)(implicit ord: Ordering[R]): RDD[R]

    Definition Classes
    RDD
  80. def intersection(other: RDD[R]): RDD[R]

    Definition Classes
    RDD
  81. lazy val isBarrier_: Boolean

    Attributes
    protected
    Definition Classes
    RDD
  82. def isCheckpointed: Boolean

    Definition Classes
    RDD
  83. def isEmpty(): Boolean

    Definition Classes
    RDD
  84. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  85. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  86. final def iterator(split: Partition, context: TaskContext): Iterator[R]

    Definition Classes
    RDD
  87. def keyBy[K]()(implicit classtag: ClassTag[K], rrf: RowReaderFactory[K], rwf: RowWriterFactory[K]): CassandraTableScanRDD[(K, R)]

    Extracts a key of the given class from all the available columns.

    See also

    keyBy(ColumnSelector)

  88. def keyBy[K](columns: ColumnRef*)(implicit classtag: ClassTag[K], rrf: RowReaderFactory[K], rwf: RowWriterFactory[K]): CassandraTableScanRDD[(K, R)]

    Extracts a key of the given class from the given columns.

    See also

    keyBy(ColumnSelector)

  89. def keyBy[K](columns: ColumnSelector)(implicit classtag: ClassTag[K], rrf: RowReaderFactory[K], rwf: RowWriterFactory[K]): CassandraTableScanRDD[(K, R)]

    Selects a subset of columns mapped to the key and returns an RDD of pairs. Similar to the built-in Spark keyBy method, but this one uses an implicit RowReaderFactory to construct the key objects. The selected columns must be available in the CassandraRDD.

    If the selected columns contain the complete partition key, a CassandraPartitioner will also be created.

    columns: column selector passed to the rrf to create the row reader, useful when the key is mapped to a tuple or a single value
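
    A minimal sketch of keying rows by a single column, assuming a hypothetical table ks.users whose partition key is an int column userid (names are placeholders). It uses the varargs overload, relying on the implicit conversion from column names to ColumnRef provided by the com.datastax.spark.connector package:

    ```scala
    import org.apache.spark.SparkContext
    import com.datastax.spark.connector._
    import com.datastax.spark.connector.rdd.CassandraTableScanRDD

    object KeyByExample {
      // Each element is a pair of (key built from userid, full row).
      def usersById(sc: SparkContext): CassandraTableScanRDD[(Tuple1[Int], CassandraRow)] =
        sc.cassandraTable("ks", "users")
          .keyBy[Tuple1[Int]]("userid") // single-column key mapped to a one-element tuple
    }
    ```

    Because userid is assumed to be the complete partition key here, the resulting RDD would also carry a CassandraPartitioner, as described above.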

  90. def keyBy[K](f: (R) ⇒ K): RDD[(K, R)]

    Definition Classes
    RDD
  91. val keyspaceName: String

  92. def limit(rowLimit: Long): Self

    Adds the LIMIT clause to the CQL SELECT statement. The limit is applied to each created Spark partition. In other words, unless the data are fetched from a single Cassandra partition, the number of results is unpredictable.

    The main purpose of passing a limit clause is to fetch the top n rows from a single Cassandra partition when the table is designed to use clustering keys and a partition key predicate is passed to the where clause.

    Definition Classes
    CassandraRDD
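
    Why the result size is unpredictable can be illustrated with a plain-Scala simulation. This is not connector code: each inner Seq merely stands in for the rows of one Spark partition, and applyLimit is a made-up helper that applies the limit to every partition separately.

    ```scala
    object PerSparkPartitionLimitDemo {
      // Applies the limit to each simulated Spark partition, then concatenates the results.
      def applyLimit[A](partitions: Seq[Seq[A]], limit: Int): Seq[A] =
        partitions.flatMap(_.take(limit))

      def main(args: Array[String]): Unit = {
        val twoPartitions = Seq(Seq(1, 2, 3), Seq(4, 5, 6))
        val onePartition  = Seq(Seq(1, 2, 3, 4, 5, 6))
        println(applyLimit(twoPartitions, 2)) // List(1, 2, 4, 5): more rows than the limit
        println(applyLimit(onePartition, 2))  // List(1, 2): exactly the limit
      }
    }
    ```

    The same six rows yield four results when split across two partitions and two results when read as one, which is why limit is mainly useful together with a partition key predicate.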
  93. val limit: Option[CassandraLimit]

    Definition Classes
    CassandraTableScanRDD → CassandraRDD
  94. def localCheckpoint(): CassandraTableScanRDD.this.type

    Definition Classes
    RDD
  95. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  96. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  97. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  98. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  99. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  100. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  101. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  102. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  103. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  104. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  105. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  106. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  107. def map[U](f: (R) ⇒ U)(implicit arg0: ClassTag[U]): RDD[U]

    Definition Classes
    RDD
  108. def mapPartitions[U](f: (Iterator[R]) ⇒ Iterator[U], preservesPartitioning: Boolean)(implicit arg0: ClassTag[U]): RDD[U]

    Definition Classes
    RDD
  109. def mapPartitionsWithIndex[U](f: (Int, Iterator[R]) ⇒ Iterator[U], preservesPartitioning: Boolean)(implicit arg0: ClassTag[U]): RDD[U]

    Definition Classes
    RDD
  110. def max()(implicit ord: Ordering[R]): R

    Definition Classes
    RDD
  111. def min()(implicit ord: Ordering[R]): R

    Definition Classes
    RDD
  112. var name: String

    Definition Classes
    RDD
  113. def narrowColumnSelection(columns: Seq[ColumnRef]): Seq[ColumnRef]

    Filters the currently selected set of columns with a new set of columns.

    Definition Classes
    CassandraTableRowReaderProvider
  114. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  115. final def notify(): Unit

    Definition Classes
    AnyRef
  116. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  117. def parent[U](j: Int)(implicit arg0: ClassTag[U]): RDD[U]

    Attributes
    protected[org.apache.spark]
    Definition Classes
    RDD
  118. lazy val partitionGenerator: CassandraPartitionGenerator[V, T]

  119. val partitioner: Option[Partitioner]

    Definition Classes
    CassandraTableScanRDD → RDD
  120. final def partitions: Array[Partition]

    Definition Classes
    RDD
  121. def perPartitionLimit(rowLimit: Long): Self

    Adds the PER PARTITION LIMIT clause to the CQL SELECT statement. The limit is applied to every Cassandra partition. Only valid for Cassandra 3.6+.

    Definition Classes
    CassandraRDD
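
    A short sketch of fetching one row per Cassandra partition, assuming a hypothetical table ks.readings partitioned by sensor and running on Cassandra 3.6 or later (the table name is a placeholder):

    ```scala
    import org.apache.spark.SparkContext
    import com.datastax.spark.connector._
    import com.datastax.spark.connector.rdd.CassandraTableScanRDD

    object PerPartitionLimitExample {
      // Returns at most one row for every Cassandra partition of the table:
      // the first row in the table's clustering order.
      def firstRowPerPartition(sc: SparkContext): CassandraTableScanRDD[CassandraRow] =
        sc.cassandraTable("ks", "readings")
          .perPartitionLimit(1)
    }
    ```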
  122. def persist(): CassandraTableScanRDD.this.type

    Definition Classes
    RDD
  123. def persist(newLevel: StorageLevel): CassandraTableScanRDD.this.type

    Definition Classes
    RDD
  124. def pipe(command: Seq[String], env: Map[String, String], printPipeContext: ((String) ⇒ Unit) ⇒ Unit, printRDDElement: (R, (String) ⇒ Unit) ⇒ Unit, separateWorkingDir: Boolean, bufferSize: Int, encoding: String): RDD[String]

    Definition Classes
    RDD
  125. def pipe(command: String, env: Map[String, String]): RDD[String]

    Definition Classes
    RDD
  126. def pipe(command: String): RDD[String]

    Definition Classes
    RDD
  127. final def preferredLocations(split: Partition): Seq[String]

    Definition Classes
    RDD
  128. def randomSplit(weights: Array[Double], seed: Long): Array[RDD[R]]

    Definition Classes
    RDD
  129. val readConf: ReadConf

  130. def reduce(f: (R, R) ⇒ R): R

    Definition Classes
    RDD
  131. def repartition(numPartitions: Int)(implicit ord: Ordering[R]): RDD[R]

    Definition Classes
    RDD
  132. lazy val rowReader: RowReader[R]

  133. implicit val rowReaderFactory: RowReaderFactory[R]

    RowReaderFactory and ClassTag should be provided from implicit parameters in the constructor of the class implementing this trait

    Definition Classes
    CassandraTableScanRDD → CassandraTableRowReaderProvider
    See also

    CassandraTableScanRDD

  134. def sample(withReplacement: Boolean, fraction: Double, seed: Long): RDD[R]

    Definition Classes
    RDD
  135. def saveAsObjectFile(path: String): Unit

    Definition Classes
    RDD
  136. def saveAsTextFile(path: String, codec: Class[_ <: CompressionCodec]): Unit

    Definition Classes
    RDD
  137. def saveAsTextFile(path: String): Unit

    Definition Classes
    RDD
  138. val sc: SparkContext

  139. def select(columns: ColumnRef*): Self

    Narrows down the selected set of columns. Use this for better performance when you don't need all the columns in the result RDD. When called multiple times, it selects a subset of the already selected columns, so once a column has been removed by a previous select call, it is not possible to add it back.

    The selected columns are ColumnRef instances. This type allows specifying columns for straightforward retrieval as well as reading the TTL or write time of regular columns. Implicit conversions included in the com.datastax.spark.connector package make it possible to provide just column names (which is also backward compatible) and optionally add a .ttl or .writeTime suffix in order to create an appropriate ColumnRef instance.

    Definition Classes
    CassandraRDD
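
    A minimal sketch of the column selection described above, assuming a hypothetical table ks.kv with a primary key column key and a regular column value (names are placeholders). Plain strings select columns directly, while the .ttl and .writeTime suffixes select the corresponding metadata of a regular column:

    ```scala
    import org.apache.spark.SparkContext
    import com.datastax.spark.connector._
    import com.datastax.spark.connector.rdd.CassandraTableScanRDD

    object SelectExample {
      def valuesWithMetadata(sc: SparkContext): CassandraTableScanRDD[CassandraRow] =
        sc.cassandraTable("ks", "kv")
          .select("key", "value", "value".ttl, "value".writeTime)

      // A second select can only narrow the selection further;
      // "value" can no longer be added back after this call.
      def keysOnly(sc: SparkContext): CassandraTableScanRDD[CassandraRow] =
        valuesWithMetadata(sc).select("key")
    }
    ```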
  140. def selectedColumnNames: Seq[String]

    Definition Classes
    CassandraRDD
  141. lazy val selectedColumnRefs: Seq[ColumnRef]

    Returns the columns to be selected from the table.

    Definition Classes
    CassandraTableRowReaderProvider
  142. def setName(_name: String): CassandraTableScanRDD.this.type

    Definition Classes
    RDD
  143. def sortBy[K](f: (R) ⇒ K, ascending: Boolean, numPartitions: Int)(implicit ord: Ordering[K], ctag: ClassTag[K]): RDD[R]

    Definition Classes
    RDD
  144. def sparkContext: SparkContext

    Definition Classes
    RDD
  145. def splitCount: Option[Int]

    Attributes
    protected
    Definition Classes
    CassandraTableRowReaderProvider
  146. def splitSize: Long

    Attributes
    protected[com.datastax.spark.connector]
    Definition Classes
    CassandraTableRowReaderProvider
  147. def subtract(other: RDD[R], p: Partitioner)(implicit ord: Ordering[R]): RDD[R]

    Definition Classes
    RDD
  148. def subtract(other: RDD[R], numPartitions: Int): RDD[R]

    Definition Classes
    RDD
  149. def subtract(other: RDD[R]): RDD[R]

    Definition Classes
    RDD
  150. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  151. lazy val tableDef: TableDef

  152. val tableName: String

  153. def take(num: Int): Array[R]

    Definition Classes
    CassandraRDD → RDD
  154. def takeOrdered(num: Int)(implicit ord: Ordering[R]): Array[R]

    Definition Classes
    RDD
  155. def takeSample(withReplacement: Boolean, num: Int, seed: Long): Array[R]

    Definition Classes
    RDD
  156. def toDebugString: String

    Definition Classes
    RDD
  157. def toEmptyCassandraRDD: EmptyCassandraRDD[R]

    Definition Classes
    CassandraTableScanRDD → CassandraRDD
  158. def toJavaRDD(): JavaRDD[R]

    Definition Classes
    RDD
  159. def toLocalIterator: Iterator[R]

    Definition Classes
    RDD
  160. def toString(): String

    Definition Classes
    RDD → AnyRef → Any
  161. implicit lazy val tokenFactory: TokenFactory[V, T]

    Definition Classes
    SplitSizeEstimator
  162. def top(num: Int)(implicit ord: Ordering[R]): Array[R]

    Definition Classes
    RDD
  163. def treeAggregate[U](zeroValue: U)(seqOp: (U, R) ⇒ U, combOp: (U, U) ⇒ U, depth: Int)(implicit arg0: ClassTag[U]): U

    Definition Classes
    RDD
  164. def treeReduce(f: (R, R) ⇒ R, depth: Int): R

    Definition Classes
    RDD
  165. def union(other: RDD[R]): RDD[R]

    Definition Classes
    RDD
  166. def unpersist(blocking: Boolean): CassandraTableScanRDD.this.type

    Definition Classes
    RDD
  167. def verify(): RowReader[R]

    Checks for existence of keyspace and table.

    Definition Classes
    CassandraTableRowReaderProvider
  168. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  169. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  170. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  171. def where(cql: String, values: Any*): Self

    Adds CQL WHERE predicate(s) to the query. Useful for leveraging secondary indexes in Cassandra. Implicitly adds an ALLOW FILTERING clause to the WHERE clause. Beware, however, that some predicates might be rejected by Cassandra, particularly when they filter on an unindexed, non-clustering column.

    Definition Classes
    CassandraRDD
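
    A minimal sketch of pushing a predicate down to Cassandra, assuming a hypothetical table ks.cars with a secondary index on the color column (names are placeholders). The ? placeholder is bound to the value passed after the CQL fragment:

    ```scala
    import org.apache.spark.SparkContext
    import com.datastax.spark.connector._
    import com.datastax.spark.connector.rdd.CassandraTableScanRDD

    object WhereExample {
      // The predicate is evaluated by Cassandra, so only matching rows reach Spark.
      def carsByColor(sc: SparkContext, color: String): CassandraTableScanRDD[CassandraRow] =
        sc.cassandraTable("ks", "cars")
          .where("color = ?", color)
    }
    ```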
  172. val where: CqlWhereClause

    Definition Classes
    CassandraTableScanRDD → CassandraRDD
  173. def withAscOrder: Self

    Definition Classes
    CassandraRDD
  174. def withConnector(connector: CassandraConnector): Self

    Returns a copy of this Cassandra RDD with the specified connector.

    Definition Classes
    CassandraRDD
  175. def withDescOrder: Self

    Definition Classes
    CassandraRDD
  176. def withReadConf(readConf: ReadConf): Self

    Allows setting a custom read configuration, e.g. consistency level or fetch size.

    Definition Classes
    CassandraRDD
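
    A brief sketch of overriding the fetch size for a single RDD. It assumes that ReadConf is a case class with a fetchSizeInRows field, which should be checked against the connector version in use; ks and kv are placeholder names:

    ```scala
    import org.apache.spark.SparkContext
    import com.datastax.spark.connector._
    import com.datastax.spark.connector.rdd.CassandraTableScanRDD

    object ReadConfExample {
      def smallFetches(sc: SparkContext): CassandraTableScanRDD[CassandraRow] = {
        val rdd = sc.cassandraTable("ks", "kv")
        // Start from the RDD's current read configuration and change only the fetch size.
        // Assumption: ReadConf exposes copy(fetchSizeInRows = ...).
        rdd.withReadConf(rdd.readConf.copy(fetchSizeInRows = 200))
      }
    }
    ```

    Settings changed this way apply only to the returned copy; other RDDs created from the same SparkContext keep the values from SparkConf.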
  177. def zip[U](other: RDD[U])(implicit arg0: ClassTag[U]): RDD[(R, U)]

    Definition Classes
    RDD
  178. def zipPartitions[B, C, D, V](rdd2: RDD[B], rdd3: RDD[C], rdd4: RDD[D])(f: (Iterator[R], Iterator[B], Iterator[C], Iterator[D]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[C], arg2: ClassTag[D], arg3: ClassTag[V]): RDD[V]

    Definition Classes
    RDD
  179. def zipPartitions[B, C, D, V](rdd2: RDD[B], rdd3: RDD[C], rdd4: RDD[D], preservesPartitioning: Boolean)(f: (Iterator[R], Iterator[B], Iterator[C], Iterator[D]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[C], arg2: ClassTag[D], arg3: ClassTag[V]): RDD[V]

    Definition Classes
    RDD
  180. def zipPartitions[B, C, V](rdd2: RDD[B], rdd3: RDD[C])(f: (Iterator[R], Iterator[B], Iterator[C]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[C], arg2: ClassTag[V]): RDD[V]

    Definition Classes
    RDD
  181. def zipPartitions[B, C, V](rdd2: RDD[B], rdd3: RDD[C], preservesPartitioning: Boolean)(f: (Iterator[R], Iterator[B], Iterator[C]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[C], arg2: ClassTag[V]): RDD[V]

    Definition Classes
    RDD
  182. def zipPartitions[B, V](rdd2: RDD[B])(f: (Iterator[R], Iterator[B]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[V]): RDD[V]

    Definition Classes
    RDD
  183. def zipPartitions[B, V](rdd2: RDD[B], preservesPartitioning: Boolean)(f: (Iterator[R], Iterator[B]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[V]): RDD[V]

    Definition Classes
    RDD
  184. def zipWithIndex(): RDD[(R, Long)]

    Definition Classes
    RDD
  185. def zipWithUniqueId(): RDD[(R, Long)]

    Definition Classes
    RDD
