com.datastax.bdp.graph.spark.graphframe.classic
DSE Graph stores two copies of each edge, one as "IN" and one as "OUT". It can also "cache" (partition) one of the directions in a local C* table; in that case only one copy of the edge is available in the general table. The following algorithm is therefore used to gather edges:
1. If the requested vertices have no partitioned edges, or only the "IN" direction was partitioned, select all "OUT" edge copies for the given vertexLabels.
2. If only the "OUT" direction was partitioned, select all "IN" edge copies.
3. If both "IN" and "OUT" partitions exist, select both and call distinct on the result to remove duplicates. This is slower on startup, but Spark repartitions the edge DataFrame, which gives better join performance later.
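The three-way choice described above can be sketched in plain Scala. This is only an illustration of the decision logic, not the DSE implementation: the names `EdgeSelection`, `Direction`, `directionsToRead`, `Edge` and `merge` are hypothetical, and in-memory sequences stand in for the real edge DataFrames.

```scala
// Hypothetical sketch of the edge-gathering decision; not the DSE source.
object EdgeSelection {
  sealed trait Direction
  case object In extends Direction
  case object Out extends Direction

  // Decide which edge copies to read, given which directions are stored
  // in partition ("cached") tables for the requested vertex labels.
  def directionsToRead(inPartitioned: Boolean, outPartitioned: Boolean): Set[Direction] =
    (inPartitioned, outPartitioned) match {
      case (false, false) => Set[Direction](Out)     // step 1: nothing partitioned
      case (true, false)  => Set[Direction](Out)     // step 1: only IN partitioned
      case (false, true)  => Set[Direction](In)      // step 2: only OUT partitioned
      case (true, true)   => Set[Direction](In, Out) // step 3: read both copies
    }

  // Assumed edge shape: source id, destination id, label.
  final case class Edge(src: String, dst: String, label: String)

  // Step 3 only: merge both copies and drop the duplicates.
  def merge(inCopies: Seq[Edge], outCopies: Seq[Edge]): Seq[Edge] =
    (inCopies ++ outCopies).distinct
}
```

On real DataFrames the deduplication in step 3 would correspond to a union followed by `distinct`, which forces a shuffle; that shuffle is the startup cost mentioned above.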
Note: If a subset of vertex labels was passed to the Builder, some edges could point to non-existent vertices.
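If dangling edges are a problem for a given computation, one possible way to handle them is to keep only the edges whose two endpoints are both present. The sketch below shows the idea on plain Scala collections; `DanglingEdges`, `Vertex`, `Edge` and `dropDangling` are hypothetical names, and nothing here is part of the builder's API.

```scala
// Hypothetical helper, shown on in-memory collections for illustration.
object DanglingEdges {
  final case class Vertex(id: String, label: String)
  final case class Edge(src: String, dst: String, label: String)

  // Keep only edges whose source and destination both exist in `vertices`.
  def dropDangling(vertices: Seq[Vertex], edges: Seq[Edge]): Seq[Edge] = {
    val ids = vertices.map(_.id).toSet
    edges.filter(e => ids.contains(e.src) && ids.contains(e.dst))
  }
}
```

On DataFrames the same filter would typically be written as two `left_semi` joins of the edge DataFrame against the vertex ids.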
GraphFrame compatible non-cached edge DataFrame
Create a new DseGraphFrame from the provided GraphFrame, or load the data from DSE
graph frame
Provide additional Spark Cassandra Connector options
examples:
builder.option("cluster", "ClusterOne")
builder.option("spark.cassandra.connection.host", "192.168.0.1")
cassandra connector option name
value
this
Provide additional Spark Cassandra Connector options as a map
examples:
builder.options(Map("cluster" -> "ClusterOne", "spark.cassandra.connection.host" -> "192.168.0.1"))
cassandra connector option map
this
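Because both option methods are documented as returning `this`, calls can be chained. The sketch below illustrates that pattern with a hypothetical stand-in class, `SketchBuilder`; it is not the DSE builder, and the way it stores options (a simple immutable map where later values overwrite earlier ones) is an assumption made for the example.

```scala
// Hypothetical stand-in that mimics the documented "returns this" behaviour.
final class SketchBuilder {
  private var opts: Map[String, String] = Map.empty

  // Add a single connector option and return the builder for chaining.
  def option(name: String, value: String): this.type = {
    opts = opts + (name -> value)
    this
  }

  // Add several connector options at once and return the builder.
  def options(extra: Map[String, String]): this.type = {
    opts = opts ++ extra
    this
  }

  // Expose the accumulated options for inspection.
  def currentOptions: Map[String, String] = opts
}
```

With this stand-in, `new SketchBuilder().option("cluster", "ClusterOne").options(Map("spark.cassandra.connection.host" -> "192.168.0.1"))` accumulates both settings in one expression.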
GraphFrame compatible non-cached vertex DataFrame
Limit the edge labels to create a subgraph
Provide data from an external source; the builder will only read the schema from DSE
Limit the vertex labels to create a subgraph
Helper class to create a GraphFrame from the C* backend. The GraphFrame caches its DataFrames on creation, so it is recommended to call the withVertex method to create a subgraph before creating the graph frame, which reduces the memory footprint.
Usage: val dataFrame = DseGraphFrameBuilder("graph", spark).dseGraph()