com.datastax.bdp.graph.spark.graphframe
DSE Graph stores two copies of each edge, as "IN" and "OUT". It can also "cache" (partition) one of the directions in a local C* table; in that case only one copy of the edge is available in the general table. The following algorithm is therefore used to gather edges:
1. If the requested vertices have no partitioned edges, or only the "IN" direction was partitioned, select all "OUT" edge copies for the given vertexLabels.
2. If only the "OUT" direction was partitioned, select all "IN" edge copies.
3. If both "IN" and "OUT" partitions exist, select both and call distinct on the result to remove duplicates. This is slower on startup, but Spark repartitions the edges DataFrame, which gives better join performance later.
Note: If only a subset of vertex labels was passed to the Builder, some edges may point to non-existent vertices.
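The three selection rules above can be sketched in plain Scala, without Spark or DSE on the classpath. This is only an illustration of the decision logic, not the library's implementation: `Direction`, `EdgeCopy`, `EdgeGathering`, and `gatherEdges` are hypothetical names, and an ordinary `Seq` stands in for the edge DataFrame read from the general table.

```scala
// Hypothetical model for illustration; none of these names come from the DSE API.
sealed trait Direction
case object In extends Direction
case object Out extends Direction

// One stored copy of an edge as it might appear in the general table.
final case class EdgeCopy(id: Long, src: String, dst: String, direction: Direction)

object EdgeGathering {
  // Returns each edge once as (id, src, dst), following the three rules.
  def gatherEdges(
      copies: Seq[EdgeCopy],
      inPartitioned: Boolean,
      outPartitioned: Boolean
  ): Seq[(Long, String, String)] = {
    def select(d: Direction): Seq[(Long, String, String)] =
      copies.filter(_.direction == d).map(c => (c.id, c.src, c.dst))

    (inPartitioned, outPartitioned) match {
      // Rule 1: no partitions, or only IN partitioned: OUT copies are complete.
      case (_, false)    => select(Out)
      // Rule 2: only OUT partitioned: IN copies are complete.
      case (false, true) => select(In)
      // Rule 3: both partitioned: take both directions and drop duplicates.
      case (true, true)  => (select(Out) ++ select(In)).distinct
    }
  }
}
```

In the real builder the same branches would presumably operate on DataFrames, where the deduplication in the third branch is the step that costs a shuffle at startup.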
GraphFrame-compatible, non-cached edge DataFrame
graph frame
GraphFrame-compatible, non-cached vertex DataFrame
Subgraph that contains only vertices with the given labels.
Helper class to create a GraphFrame from the C* backend. The GraphFrame caches its DataFrames on creation. To reduce the memory footprint, it is recommended to call the withVertex method to create a subgraph before creating the graph frame.
Usage: val dataFrame = DseGraphFrameBuilder("graph", spark).dseGraph()