com.datastax.bdp.graph.spark.graphframe

DseGraphFrame

class DseGraphFrame extends Serializable

Provides DSE Graph-specific methods on GraphFrame. The graphName is needed for some traversal steps and to write data back. It can be lost during DseGraphFrame->GraphFrame->DseGraphFrame implicit conversions; set graphName to the target graph if needed.
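
The following is a minimal sketch of restoring the graph name on a wrapper that was built without one. It assumes the spark.dseGraph helper from this package's implicits is in scope and uses hypothetical graph names ("source_graph", "target_graph"); only the constructor, gf, and the graphName setter come from this page.

```scala
import org.apache.spark.sql.SparkSession
import com.datastax.bdp.graph.spark.graphframe._

object GraphNameSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("graph-name-sketch").getOrCreate()

    // Assumption: spark.dseGraph is provided by the package implicits.
    val g: DseGraphFrame = spark.dseGraph("source_graph")

    // Wrapping the plain GraphFrame again relies on the constructor defaults,
    // so the new wrapper starts with _graphName = None.
    val rewrapped = new DseGraphFrame(g.gf)

    // Restore (or change) the name before traversals or writes that need it.
    rewrapped.graphName = "target_graph"

    spark.stop()
  }
}
```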

Linear Supertypes
Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new DseGraphFrame(gf: GraphFrame, _graphName: Option[String] = None, _graphSchema: Option[SerializableSchema] = None)

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def E(): DseGraphTraversal[Edge]

    The returned graph traversal supports a subset of TinkerPop 3 traversal steps.

    returns

    GraphTraversal[Edge] for the graph

  5. def V(): DseGraphTraversal[Vertex]

    The returned graph traversal supports a subset of TinkerPop 3 traversal steps.

    returns

    GraphTraversal[Vertex] for the graph
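
    As a rough illustration of the V() and E() entry points, the sketch below counts vertices with one label and collects edges with another into a DataFrame. Several things here are assumptions rather than facts from this page: the spark.dseGraph helper, the graph name "my_graph", the labels "god" and "father" (borrowed from the sample rows shown elsewhere on this page), and that hasLabel, count and the df conversion are part of the supported subset of steps.

    ```scala
    import org.apache.spark.sql.SparkSession
    import com.datastax.bdp.graph.spark.graphframe._

    object TraversalSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder.appName("traversal-sketch").getOrCreate()

        // Assumption: spark.dseGraph is provided by the package implicits.
        val g = spark.dseGraph("my_graph")

        // Vertex traversal: count the vertices carrying a given label.
        val godCount = g.V().hasLabel("god").count().next()
        println(s"god vertices: $godCount")

        // Edge traversal: assumed df conversion back to the DataFrame API.
        val fatherEdges = g.E().hasLabel("father").df
        fatherEdges.show()

        spark.stop()
      }
    }
    ```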

  6. var _graphName: Option[String]

  7. var _graphSchema: Option[SerializableSchema]

    Attributes
    protected
  8. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  9. def cache(): DseGraphFrame.this.type

    Proxy call to gf.cache().

    returns

    this

  10. def cleanUp: String

    Removes invalid vertex property and edge entries from the database backend. Call this method if you get internal errors or inconsistent results from graph queries. It is strongly recommended to run nodetool repair graphName before this call and again after it. The call revises the graph database storage and fixes the following problems:

    • delete vertex property entries of non-existent vertices
    • delete vertex properties with unknown ids
    • delete edges with unknown/removed edge or vertex labels
    • delete edges that point to non-existent vertices
    • restore the second copy of the edge
    • delete the second copy of the edge if the primary record exists
  11. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  12. def deleteEdgeProperties(df: DataFrame, properties: String*): Unit

    Clean edge properties.

    properties

    delete only the selected properties, not the entire row

    Annotations
    @varargs()
  13. def deleteEdges(df: DataFrame, cache: Boolean = true): Unit

    Delete graph edges. Four id columns should be passed to the method:

    +--------------------+--------------------+-------+--------------------+
    |                 src|                 dst| ~label|                  id|
    +--------------------+--------------------+-------+--------------------+
    |god:THxdAAAAAAAAAAAA|titan:J474AAAAAAA...| father|da0a9900-8fe1-11e...|
    +--------------------+--------------------+-------+--------------------+

    df

    data frame with edge ids: src,dst,~label, id

    cache

    cache df before processing; true by default for consistent updates. Two C* entries need to be deleted for one edge, so no reloads are expected between these two calls.
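
    A minimal sketch of building the four-column id frame from the edges already held by the underlying GraphFrame and passing it to deleteEdges. The spark.dseGraph helper, the graph name "my_graph" and the "father" label filter are assumptions for illustration; the column names follow the table above.

    ```scala
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.col
    import com.datastax.bdp.graph.spark.graphframe._

    object DeleteEdgesSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder.appName("delete-edges-sketch").getOrCreate()

        // Assumption: spark.dseGraph is provided by the package implicits.
        val g = spark.dseGraph("my_graph")

        // Keep only the four id columns of the edges that should be removed.
        val toDelete = g.gf.edges
          .filter(col("~label") === "father")
          .select("src", "dst", "~label", "id")

        // cache = true is the default; it is spelled out here for clarity.
        g.deleteEdges(toDelete, cache = true)

        spark.stop()
      }
    }
    ```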

  14. def deleteEdges(df: DataFrame): Unit

    shortcut for deleteEdges(df: DataFrame, cache: Boolean = true) for Java

  15. def deleteVertexProperties(df: DataFrame, properties: Seq[String], labels: Seq[String] = Seq.empty, cache: Boolean = true): Unit

    Clean vertex properties with meta properties.

    properties

    property names to delete

  16. def deleteVertexProperties(df: DataFrame, properties: String*): Unit

    Clean vertex properties with meta properties.

    properties

    property names to delete

    Annotations
    @varargs()
  17. def deleteVertices(label: String): Unit

    delete all vertices with given label

  18. def deleteVertices(df: DataFrame, labels: Seq[String] = Seq.empty, cache: Boolean = true): Unit

    delete vertices and all related edges

  19. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  20. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  21. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  22. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  23. val gf: GraphFrame

  24. def graphName: String

  25. def graphName_=(name: String): Unit

    restore or change the name of the graph

  26. def graphSchema: SerializableSchema

    Returns the schema of this graph based on its name. A NoSuchElementException is thrown if the graph name is unknown and the schema cannot be retrieved.

    returns

    Graph Schema
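
    One way to guard against the NoSuchElementException described above is to wrap the lookup in scala.util.Try. This is only a sketch; the spark.dseGraph helper and the graph name "my_graph" are assumptions.

    ```scala
    import scala.util.{Failure, Success, Try}
    import org.apache.spark.sql.SparkSession
    import com.datastax.bdp.graph.spark.graphframe._

    object GraphSchemaLookup {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder.appName("graph-schema-lookup").getOrCreate()

        // Assumption: spark.dseGraph is provided by the package implicits.
        val g = spark.dseGraph("my_graph")

        Try(g.graphSchema) match {
          case Success(schema) =>
            println(s"Schema loaded: $schema")
          case Failure(e: NoSuchElementException) =>
            // Graph name unknown: set g.graphName and retry if appropriate.
            println(s"Schema not available: ${e.getMessage}")
          case Failure(other) =>
            throw other
        }

        spark.stop()
      }
    }
    ```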

  27. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  28. def idColumn(labelColumn: Column, idColumns: Column*): Column

    Utility method to generate GraphFrame-compatible ids when a mixed set of labels is in the DF. It is slower than idColumn(label: String, idColumns: Column*): Column. The id is added automatically when a vertex is inserted, if the inserted columns have the same names as in the graph schema. That is not possible for edges, because you need to point to both src and dst ids. Usage:

        val updateEdgeDF = sourceDF.select(
          gf.idColumn(col("srcLabel"), col("srcId")) as "src",
          gf.idColumn(col("dstLabel"), col("dstId")) as "dst",
          col("label") as "~label",
          gf.randomEdgeIdColumn,
          col("property"))
        gf.updateEdges(updateEdgeDF)

    If different labels have different id formats, use a case statement to sort them:

        when(col("srcLabel") === "1format", col("src1Id"))
          .when(col("srcLabel") === "2format", col("src2Id"))
          .otherwise(col("src3Id")) as "src"

    Annotations
    @varargs()
  29. def idColumn(label: String, idColumns: Column*): Column

    Utility method to generate GraphFrame-compatible ids. The id is added automatically when a vertex is inserted, if the inserted columns have the same names as in the graph schema. That is not possible for edges, because you need to point to both src and dst ids. Usage:

        val updateEdgeDF = sourceDF.select(
          gf.idColumn("srcLabel", col("srcId")) as "src",
          gf.idColumn("dstLabel", col("dstId")) as "dst",
          col("label") as "~label",
          gf.randomEdgeIdColumn,
          col("property"))
        gf.updateEdges(updateEdgeDF)

    Annotations
    @varargs()
  30. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  31. def nativeJavaTypeConverter(columnName: String): TypeConverter[_]

  32. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  33. final def notify(): Unit

    Definition Classes
    AnyRef
  34. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  35. def persist(storageLevel: StorageLevel): DseGraphFrame.this.type

    Proxy call to gf.persist().

    returns

    this

  36. def persist(): DseGraphFrame.this.type

    Proxy call to gf.persist().

    returns

    this

  37. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  38. def toExternalEdgeIdAsMap(id: AnyRef): Map[String, AnyRef]

  39. def toExternalVertexIdAsMap(id: AnyRef): Map[String, AnyRef]

  40. def toString(): String

    Definition Classes
    AnyRef → Any
  41. def toSyntheticVertexId(id: AnyRef): String

  42. def unpersist(blocking: Boolean): DseGraphFrame.this.type

    Proxy call to gf.unpersist().

    returns

    this

  43. def unpersist(): DseGraphFrame.this.type

    Proxy call to gf.unpersist().

    returns

    this

  44. def updateEdges(outVertexLabel: String, edgeLabel: String, inVertexLabel: String, df: DataFrame): Unit

    Update this graph's edges. This method accepts natural vertex id columns. Out-vertex column names should start with the "out_" prefix and in-vertex column names with "in_". The method updates only one triplet combination. The minimal df schema is two id columns and zero or more property columns:

    +------+-----+--------------------+-------------------+
    |out_id|in_id|                  id|               prop|
    +------+-----+--------------------+-------------------+
    |    10|    a|da0a9900-8fe1-11e...|              value|
    +------+-----+--------------------+-------------------+

    The id column should contain the UUID(0,0).toString() value for single edges and a pre-generated UUID for multi-cardinality edges. The outVertexLabel->edgeLabel->inVertexLabel triplet is passed as parameters. The df is not cached by the function; the dataframe should be persisted by the user if a dynamic data source is used.

    df

    data frame with edge ids and update columns
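
    The two kinds of id value mentioned above can be produced with java.util.UUID, as in the sketch below. The object and method names are hypothetical, and using a random UUID for multi-cardinality edges is an assumption: the text only asks for a pre-generated UUID and does not say which UUID version is expected.

    ```scala
    import java.util.UUID

    object EdgeIdValues {
      // Id value for single edges: UUID(0, 0) rendered as a string.
      val singleEdgeId: String = new UUID(0L, 0L).toString

      // Multi-cardinality edges need a distinct, pre-generated UUID per edge.
      // Assumption: a random UUID is acceptable here.
      def newMultiEdgeId(): String = UUID.randomUUID().toString

      def main(args: Array[String]): Unit = {
        println(singleEdgeId) // 00000000-0000-0000-0000-000000000000
        println(newMultiEdgeId())
      }
    }
    ```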

  45. def updateEdges(df: DataFrame, cache: Boolean = true): Unit

    Update this graph's edges. The minimal df schema is four id columns and at least one property to update:

    +--------------------+--------------------+-------+--------------------+-------------------+
    |                 src|                 dst| ~label|                  id|               prop|
    +--------------------+--------------------+-------+--------------------+-------------------+
    |god:THxdAAAAAAAAAAAA|titan:J474AAAAAAA...| father|da0a9900-8fe1-11e...|              value|
    +--------------------+--------------------+-------+--------------------+-------------------+

    If the id column is not present, it will be generated and the edges will be saved as new.

    df

    data frame with edge ids and update columns

    cache

    cache df before processing; true by default for consistent updates. Two C* entries need to be updated for one edge, so no reloads are expected between these two calls.

  46. def updateEdges(df: DataFrame): Unit

    shortcut for updateEdges(df: DataFrame, cache: Boolean = true) for Java

  47. def updateVertices(vertexLabel: String, df: DataFrame): Unit

    Update this graph's vertices with the properties provided in the df. The id should be provided in non-encoded format:

    +-----------------+---------+---------+
    |     community_id|member_id|      age|
    +-----------------+---------+---------+
    |       1182054400|        0|        0|
    +-----------------+---------+---------+

    The df is not cached by the function.

    vertexLabel

    label of the vertices to update

    df

    dataframe with vertex id and update columns

  48. def updateVertices(df: DataFrame, labels: Seq[String] = Seq.empty, cache: Boolean = true): Unit

    Update this graph's vertices with the properties provided in the df. The minimal df schema is just the vertex "id" and one property to update:

    +-----------------+---------+
    |               id|      age|
    +-----------------+---------+
    |god:AAAAATMAAA...|        0|
    +-----------------+---------+

    The label and vertex id will be extracted from the graph frame id. For better performance it is recommended to add/leave the "~label" column:

    +-----------------+---------+---------+
    |               id|   ~label|      age|
    +-----------------+---------+---------+
    |god:AAAAATMAAA...|      god|        0|
    +-----------------+---------+---------+

    You can also provide the id in non-encoded format:

    +-----------------+---------+---------+---------+
    |     community_id|member_id|   ~label|      age|
    +-----------------+---------+---------+---------+
    |       1182054400|        0|      god|        0|
    +-----------------+---------+---------+---------+

    Note: passing both the synthetic "id" and vertex id columns is an error.

    df

    dataframe with vertex id and update columns

    labels

    empty (meaning all labels) by default. It is convenient to group vertices with the same id format; that group can be passed here to reduce the number of verification steps.

    cache

    cache df before processing; true by default for consistent updates and performance
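
    The non-encoded id variant above could be exercised roughly as follows. The row values and column names are taken from the last table; the spark.dseGraph helper, the graph name "my_graph" and the Int column types are assumptions that would have to match the real graph schema.

    ```scala
    import org.apache.spark.sql.SparkSession
    import com.datastax.bdp.graph.spark.graphframe._

    object UpdateVerticesSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder.appName("update-vertices-sketch").getOrCreate()
        import spark.implicits._

        // Assumption: spark.dseGraph is provided by the package implicits.
        val g = spark.dseGraph("my_graph")

        // Natural (non-encoded) id columns plus "~label" and the property to update.
        val updates = Seq((1182054400, 0, "god", 0))
          .toDF("community_id", "member_id", "~label", "age")

        // Restricting labels to the single id format present in the frame.
        g.updateVertices(updates, labels = Seq("god"), cache = true)

        spark.stop()
      }
    }
    ```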

  49. def updateVertices(df: DataFrame): Unit

    Shortcut for updateVertices(df: DataFrame, labels: Seq[String] = Seq.empty, cache: Boolean = true) for the Java API.

    df

    dataframe with vertex id and update columns

  50. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  51. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  52. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
