DseGraphFrame

Instance Constructors

new DseGraphFrame(gf: GraphFrame, dseGraphName: Option[String] = None, graphOptions: Map[String, String] = Map.empty)

Abstract Value Members

abstract def deleteEdgeProperties(df: DataFrame, properties: String*): Unit

Deletes the selected properties from edges.
properties
names of the properties to delete; only these properties are removed, not the entire row

Annotations
@varargs()
abstract def deleteEdges(df: DataFrame, cache: Boolean = true): Unit

Deletes graph edges. The four edge id columns must be passed to the method:
```
+--------------------+--------------------+-------+--------------------+
|                 src|                 dst| ~label|                  id|
+--------------------+--------------------+-------+--------------------+
|god:THxdAAAAAAAAAAAA|titan:J474AAAAAAA...| father|da0a9900-8fe1-11e...|
+--------------------+--------------------+-------+--------------------+
```
df
DataFrame with the edge ids: src, dst, ~label, id
cache
cache the df before processing; true by default for consistent updates. Two C* entries need to be deleted for one edge, so the df must not be reloaded between these two calls.
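As a rough sketch of the call described above, the four id columns can be selected from the graph's own edge DataFrame before deletion. The package path in the import, the helper name `deleteEdgesByLabel`, and the assumption that `gf.edges` exposes the `src`, `dst`, `~label` and `id` columns are illustrative and not confirmed by this page.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
// Assumed package path for DseGraphFrame; adjust it to your DSE version.
import com.datastax.bdp.graph.spark.graphframe.DseGraphFrame

object DeleteEdgesSketch {
  /** Deletes every edge that carries the given label (hypothetical helper). */
  def deleteEdgesByLabel(graph: DseGraphFrame, edgeLabel: String): Unit = {
    // Keep only the four id columns that deleteEdges expects.
    val toDelete: DataFrame = graph.gf.edges
      .filter(col("~label") === edgeLabel)
      .select("src", "dst", "~label", "id")

    // cache = true keeps the df stable while both C* entries of each edge are removed.
    graph.deleteEdges(toDelete, cache = true)
  }
}
```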
abstract def deleteVertexProperties(df: DataFrame, properties: Seq[String], labels: Seq[String] = Seq.empty, cache: Boolean = true): Unit

Deletes vertex properties together with their meta-properties.
properties
property names to delete
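A minimal sketch of a call to this method follows. It assumes that the df identifies vertices by the encoded "id" column, in the same way updateVertices does; this page does not state the expected df schema, so treat that as a guess. The import path, the label "god", the property "age" and the helper name are also illustrative.

```scala
import org.apache.spark.sql.functions.col
// Assumed package path for DseGraphFrame; adjust it to your DSE version.
import com.datastax.bdp.graph.spark.graphframe.DseGraphFrame

object DeleteVertexPropertiesSketch {
  /** Removes the "age" property from all "god" vertices (hypothetical helper). */
  def dropAge(graph: DseGraphFrame): Unit = {
    // Assumption: the encoded "id" (plus "~label") is enough to address a vertex.
    val targets = graph.gf.vertices
      .filter(col("~label") === "god")
      .select("id", "~label")

    // Only the listed properties are removed; the vertices themselves stay.
    graph.deleteVertexProperties(targets, Seq("age"), labels = Seq("god"), cache = true)
  }
}
```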
abstract def deleteVertices(label: String): Unit

Deletes all vertices with the given label.
abstract def deleteVertices(df: DataFrame, labels: Seq[String] = Seq.empty, cache: Boolean = true): Unit

delete vertices and all related edges
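The two deleteVertices overloads could be used roughly as sketched here. The df schema for the second overload is not documented on this page, so addressing vertices by the encoded "id" column is an assumption, as are the import path, the labels, the "age" property and the helper name.

```scala
import org.apache.spark.sql.functions.col
// Assumed package path for DseGraphFrame; adjust it to your DSE version.
import com.datastax.bdp.graph.spark.graphframe.DseGraphFrame

object DeleteVerticesSketch {
  def run(graph: DseGraphFrame): Unit = {
    // Overload 1: remove every vertex labeled "titan".
    graph.deleteVertices("titan")

    // Overload 2: remove a filtered subset of "god" vertices and their edges.
    // Assumption: vertices are identified by the encoded "id" column.
    val stale = graph.gf.vertices
      .filter((col("~label") === "god") && (col("age") > 1000))
      .select("id", "~label")
    graph.deleteVertices(stale, labels = Seq("god"), cache = true)
  }
}
```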
abstract def edgeIdColumnNames: Seq[String]
abstract def idColumn(labelColumn: Column, idColumns: Column*): Column

Utility method to generate GraphFrame-compatible ids when the DF contains a mixed set of labels. It is slower than idColumn(label: String, idColumns: Column*): Column. The id is added automatically when a vertex is inserted, if the inserted columns have the same names as in the graph schema. This is not possible for edges, because you need to specify both the src and dst ids. Usage:
```
val updateEdgeDF = sourceDF.select(
  gf.idColumn(col("srcLabel"), col("srcId")) as "src",
  gf.idColumn(col("dstLabel"), col("dstId")) as "dst",
  col("label") as "~label",
  gf.randomEdgeIdColumn,
  col("property"))

gf.updateEdges(updateEdgeDF)
```
If different labels use different id formats, use a when/otherwise (case) expression to pick the matching id column:
```
when(col("srcLabel") === "1format", col("src1Id"))
  .when(col("srcLabel") === "2format", col("src2Id"))
  .otherwise(col("src3Id")) as "src"
```
Annotations
@varargs()
abstract def idColumn(label: String, idColumns: Column*): Column

Utility method to generate GraphFrame-compatible ids. The id is added automatically when a vertex is inserted, if the inserted columns have the same names as in the graph schema. This is not possible for edges, because you need to specify both the src and dst ids. Usage:
```
val updateEdgeDF = sourceDF.select(
  gf.idColumn("srcLabel", col("srcId")) as "src",
  gf.idColumn("dstLabel", col("dstId")) as "dst",
  col("label") as "~label",
  gf.randomEdgeIdColumn,
  col("property"))

gf.updateEdges(updateEdgeDF)
```
Annotations
@varargs()
abstract def toExternalEdgeId(label: String, srcId: String, dstId: String, ids: Seq[Any], schema: StructType): AnyRef

label
Edge label
srcId
Source vertex id
dstId
Destination vertex id
ids
Edge ids
schema
Associated DataFrame schema
returns
External ID object
abstract def toExternalVertexId(id: String): AnyRef

id
String of vertex ID
returns
External ID object
abstract def updateEdges(outVertexLabel: String, edgeLabel: String, inVertexLabel: String, df: DataFrame): Unit

Updates or inserts edges. This method accepts natural vertex id columns. The id representation is implementation-dependent: in a Classic graph, out-vertex id column names must start with the "out_" prefix and in-vertex id column names with "in_"; a Core graph uses the DSE-DB edge table schema. The minimal df schema is two id columns and zero or more property columns:
```
+------+-----+-------------------+
|out_id|in_id|               prop|
+------+-----+-------------------+
|    10|    a|              value|
+------+-----+-------------------+
```
The edge type is given as outVertexLabel->edgeLabel->inVertexLabel. The df is not cached by the function; persist the DataFrame yourself if a dynamic data source is used.
df
data frame with edge ids and update columns
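The following sketch shows one way to prepare a df for this overload and persist it around the call, since the method does not cache it. The source column names `godId` and `titanId`, the helper name, the import path, and the choice of `out_id`/`in_id` as natural id column names (which presumes both vertex labels have a single id column called `id`) are assumptions for illustration.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
// Assumed package path for DseGraphFrame; adjust it to your DSE version.
import com.datastax.bdp.graph.spark.graphframe.DseGraphFrame

object UpdateEdgesByLabelSketch {
  /** Upserts god -father-> titan edges from a source df (hypothetical helper). */
  def upsertFatherEdges(graph: DseGraphFrame, source: DataFrame): Unit = {
    // Rename the natural ids using the "out_" / "in_" prefixes and keep one property.
    val edges = source
      .select(col("godId") as "out_id", col("titanId") as "in_id", col("prop"))
      .persist() // the method does not cache the df, so persist it here

    try graph.updateEdges("god", "father", "titan", edges)
    finally edges.unpersist()
  }
}
```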
abstract def updateEdges(df: DataFrame, cache: Boolean = true): Unit

Updates the edges of this graph. The minimal df schema is the four id columns and at least one property to update:
```
+--------------------+--------------------+-------+--------------------+-------------------+
|                 src|                 dst| ~label|                  id|               prop|
+--------------------+--------------------+-------+--------------------+-------------------+
|god:THxdAAAAAAAAAAAA|titan:J474AAAAAAA...| father|da0a9900-8fe1-11e...|              value|
+--------------------+--------------------+-------+--------------------+-------------------+
```
If the id column is not present, it will be generated and the edges will be saved as new.
df
data frame with edge ids and update columns
cache
cache the df before processing; true by default for consistent updates. Two C* entries need to be updated for one edge, so the df must not be reloaded between these two calls.
abstract def updateVertices(vertexLabel: String, df: DataFrame): Unit

Updates the vertices of this graph with the properties provided in the df. Provide the id in non-encoded format:
```
+-----------------+---------+---------+
|     community_id|member_id|      age|
+-----------------+---------+---------+
|       1182054400|        0|        0|
+-----------------+---------+---------+
```
The df is not cached by the function; persist the DataFrame yourself if a dynamic data source is used.
vertexLabel
vertex label to update
df
dataframe with vertex ids and columns to update
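A small sketch of this overload, using the column names from the table above. The label "god", the helper name and the import path are assumptions; the df is persisted by the caller because the method does not cache it.

```scala
import org.apache.spark.sql.DataFrame
// Assumed package path for DseGraphFrame; adjust it to your DSE version.
import com.datastax.bdp.graph.spark.graphframe.DseGraphFrame

object UpdateVerticesByLabelSketch {
  /** Updates "age" for vertices addressed by their natural id columns (hypothetical helper). */
  def updateAges(graph: DseGraphFrame, source: DataFrame): Unit = {
    // Natural (non-encoded) id columns plus the property to update.
    val updates = source
      .select("community_id", "member_id", "age")
      .persist() // protects against a dynamic source changing between reads

    try graph.updateVertices("god", updates)
    finally updates.unpersist()
  }
}
```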
abstract def updateVertices(df: DataFrame, labels: Seq[String] = Seq.empty, cache: Boolean = true): Unit

Updates the vertices of this graph with the properties provided in the df. The minimal df schema is just the vertex "id" and one property to update:
```
+-----------------+---------+
|               id|      age|
+-----------------+---------+
|god:AAAAATMAAA...|        0|
+-----------------+---------+
```
The label and vertex id are extracted from the GraphFrame id. For better performance, it is recommended to add or keep the "~label" column:
```
+-----------------+---------+---------+
|               id|   ~label|      age|
+-----------------+---------+---------+
|god:AAAAATMAAA...|      god|        0|
+-----------------+---------+---------+
```
You can also provide the id in non-encoded format:
```
+-----------------+---------+---------+---------+
|     community_id|member_id|   ~label|      age|
+-----------------+---------+---------+---------+
|       1182054400|        0|      god|        0|
+-----------------+---------+---------+---------+
```
Note: passing both the synthetic "id" column and the vertex id columns is an error.
df
dataframe with vertex id and update columns
labels
empty (meaning all) by default. It is convenient to group vertices with the same id format; that group can be passed here to reduce the number of verification steps
cache
cache the df before processing; true by default for consistent updates and performance
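As an illustration of the encoded-id form with the recommended "~label" column, the sketch below reads vertices from the graph's own vertex DataFrame and writes a new value back. The label "god", the "age" property, the helper name and the import path are assumptions, as is the presence of "id" and "~label" columns on `gf.vertices`.

```scala
import org.apache.spark.sql.functions.{col, lit}
// Assumed package path for DseGraphFrame; adjust it to your DSE version.
import com.datastax.bdp.graph.spark.graphframe.DseGraphFrame

object UpdateVerticesSketch {
  /** Sets age = 0 on every "god" vertex (hypothetical helper). */
  def resetAge(graph: DseGraphFrame): Unit = {
    // Encoded "id", the "~label" column, and the property to update.
    val updates = graph.gf.vertices
      .filter(col("~label") === "god")
      .select(col("id"), col("~label"), lit(0) as "age")

    // Passing the label group up front reduces the number of verification steps.
    graph.updateVertices(updates, labels = Seq("god"), cache = true)
  }
}
```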

Concrete Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def E(edgesIds: AnyRef*): DseGraphTraversal[Edge]

Returns a graph traversal that supports a subset of TinkerPop3 traversal steps.
edgesIds
ids of the edges to start the traversal with
returns
GraphTraversal[Edge] for the filtered graph

Annotations
@varargs()
def E: DseGraphTraversal[Edge]

Returns a graph traversal that supports a subset of TinkerPop3 traversal steps.
returns
GraphTraversal[Edge] for the graph
def V(vertexIds: AnyRef*): DseGraphTraversal[Vertex]

Returns a graph traversal that supports a subset of TinkerPop3 traversal steps.
vertexIds
ids of the vertices to start the traversal with
returns
GraphTraversal[Vertex] for the filtered graph

Annotations
@varargs()
def V: DseGraphTraversal[Vertex]

Returns a graph traversal that supports a subset of TinkerPop3 traversal steps.
returns
GraphTraversal[Vertex] for the graph
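The traversal entry points V and E might be used as in the sketch below. This page does not list which TinkerPop3 steps are supported, so the use of `hasLabel` and `out` is an assumption, as are the labels, the helper names and the import path.

```scala
// Assumed package path for DseGraphFrame; adjust it to your DSE version.
import com.datastax.bdp.graph.spark.graphframe.DseGraphFrame

object TraversalSketch {
  /** Vertices reached from "god" vertices over "father" edges. */
  def fathersOfGods(graph: DseGraphFrame) =
    graph.V().hasLabel("god").out("father") // assumes both steps are in the supported subset

  /** All edges labeled "father". */
  def fatherEdges(graph: DseGraphFrame) =
    graph.E().hasLabel("father")
}
```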
final def asInstanceOf[T0]: T0

Definition Classes
Any
def cache(): DseGraphFrame.this.type

proxy call to gf.cache()
returns
this
def cleanUp: String

Remove any invalid edge entries from the database backend.
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
val clusterName: String

Attributes
protected
lazy val connector: CassandraConnector

Attributes
protected
val connectorOptions: Map[String, String]

Attributes
protected
def deleteEdges(df: DataFrame): Unit

shortcut for deleteEdges(df: DataFrame, cache: Boolean = true) for Java
def deleteVertexProperties(df: DataFrame, properties: String*): Unit

Deletes vertex properties together with their meta-properties.
properties
property names to delete

Annotations
@varargs()
def dropIsolatedVertices(): DseGraphFrame

proxy call to gf.dropIsolatedVertices()
returns
new filtered DseGraphFrame
var dseGraphName: Option[String]

Attributes
protected
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def filterEdges(conditionExpr: String): DseGraphFrame

proxy call to gf.filterEdges()
returns
new filtered DseGraphFrame
def filterEdges(condition: Column): DseGraphFrame

proxy call to gf.filterEdges()
returns
new filtered DseGraphFrame
def filterVertices(conditionExpr: String): DseGraphFrame

proxy call to gf.filterVertices()
returns
new filtered DseGraphFrame
def filterVertices(condition: Column): DseGraphFrame

proxy call to gf.filterVertices()
returns
new filtered DseGraphFrame
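Because each filter method returns a new DseGraphFrame, the proxies can be chained. The sketch below combines a SQL expression filter, a Column filter and dropIsolatedVertices; the "age" property, the "father" label, the helper name and the import path are assumptions.

```scala
import org.apache.spark.sql.functions.col
// Assumed package path for DseGraphFrame; adjust it to your DSE version.
import com.datastax.bdp.graph.spark.graphframe.DseGraphFrame

object FilterSketch {
  /** Builds a subgraph of old vertices connected by "father" edges (hypothetical helper). */
  def fatherSubgraph(graph: DseGraphFrame): DseGraphFrame =
    graph
      .filterVertices("age > 1000")            // SQL expression overload
      .filterEdges(col("~label") === "father") // Column overload
      .dropIsolatedVertices()                  // drop vertices left without edges
}
```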
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
val gf: GraphFrame
def graphName: String

Returns the graph name of this DseGraphFrame.

Exceptions thrown
NoSuchElementException if the graph name is not set.
val graphOptions: Map[String, String]
def hashCode(): Int

Definition Classes
AnyRef → Any
def io(url: String): DseGraphTraversal[Vertex]

Performs a read or write based operation on the Graph backing this GraphTraversalSource. This step can be accompanied by the GraphTraversal#with(String, Object) modulator for further configuration and must be accompanied by a GraphTraversal#read() or GraphTraversal#write() modulator step, which terminates the traversal.
url
the URL of a file in a distributed file system, a JDBC connection, or the name of a file in the default file system to which the read or write applies. How this parameter is used depends entirely on the implementation; for example, the Cassandra reader/writer implementation ignores this path and reads the table name from the parameters.
returns
the traversal with the IoStep added
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
def persist(storageLevel: StorageLevel): DseGraphFrame.this.type

proxy call to gf.persist()
returns
this
def persist(): DseGraphFrame.this.type

proxy call to gf.persist()
returns
this
lazy val spark: SparkSession

Attributes
protected
lazy val sqlContext: SQLContext

Attributes
protected
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
def unpersist(blocking: Boolean): DseGraphFrame.this.type

proxy call to gf.unpersist()
returns
this
def unpersist(): DseGraphFrame.this.type

proxy call to gf.unpersist()
returns
this
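The persist and unpersist proxies can be paired around a block of work, as in this sketch. The helper name `withPersisted`, the chosen storage level and the import path for DseGraphFrame are assumptions.

```scala
import org.apache.spark.storage.StorageLevel
// Assumed package path for DseGraphFrame; adjust it to your DSE version.
import com.datastax.bdp.graph.spark.graphframe.DseGraphFrame

object PersistSketch {
  /** Runs `body` with the graph persisted, then releases it (hypothetical helper). */
  def withPersisted[T](graph: DseGraphFrame)(body: DseGraphFrame => T): T = {
    graph.persist(StorageLevel.MEMORY_AND_DISK) // proxies to gf.persist(storageLevel)
    try body(graph)
    finally graph.unpersist()                   // always release the cached data
  }
}
```

A caller could then write `PersistSketch.withPersisted(graph) { g => g.filterVertices("age > 1000") }` to reuse the cached graph inside the block.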
def updateEdges(df: DataFrame): Unit

shortcut for updateEdges(df: DataFrame, cache: Boolean = true) for Java
def updateVertices(df: DataFrame): Unit

shortcut for updateVertices(df: DataFrame, labels: Seq[String] = Seq.empty, cache: Boolean = true) for Java API
df
dataframe with vertex id and update columns
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
def withCachedDf[T](df: DataFrame, cache: Boolean)(code: ⇒ T): T

Attributes
protected

abstract class DseGraphFrame extends Serializable
