Apache TinkerPop™ graph computing framework
Describe the Apache TinkerPop framework.
Apache TinkerPop™ is a graph abstraction layer that works with numerous graph databases and graph processors. TinkerPop is composed of two elements: a structure API and a process API.
The primary components of the TinkerPop structure API are:
- Graph
- Maintains a set of vertices and edges.
- Vertex
- Extends the general class Element and maintains a set of incoming and outgoing edges, a collection of properties, and a vertex type. For each VertexLabel, the DSE Graph schema stores an ID, name, and Time-To-Live (TTL).
- Edge
- Extends Element and maintains an incoming and an outgoing vertex, a collection of properties, and an edge type. For each EdgeLabel, the DSE Graph schema stores an ID, name, TTL, multiplicity (multi or simple), and the unidirected, visible, and sort-key settings.
- Property
- A string key associated with a value. For each PropertyKey, the DSE Graph schema stores an ID, name, TTL, data type, and cardinality (single or list).
- VertexProperty
- A string key associated with a value as well as a collection of metadata properties (vertices only).
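To make the relationships between these components concrete, the following is a minimal sketch in plain Python. It is not the TinkerPop structure API: the class layout, the method names (`add_vertex`, `add_edge`), and the sample data are assumptions chosen only to mirror the descriptions above.

```python
from typing import Any, Dict, List


class VertexProperty:
    """A key/value pair on a vertex that can carry its own metadata properties."""

    def __init__(self, key: str, value: Any) -> None:
        self.key = key
        self.value = value
        self.meta: Dict[str, Any] = {}  # metadata properties (vertices only)


class Element:
    """Common base for vertices and edges: an id, a label (type), and properties."""

    def __init__(self, element_id: int, label: str) -> None:
        self.id = element_id
        self.label = label
        self.properties: Dict[str, Any] = {}  # string key -> value


class Vertex(Element):
    def __init__(self, element_id: int, label: str) -> None:
        super().__init__(element_id, label)
        self.out_edges: List["Edge"] = []  # edges leaving this vertex
        self.in_edges: List["Edge"] = []   # edges arriving at this vertex


class Edge(Element):
    def __init__(self, element_id: int, label: str,
                 out_vertex: Vertex, in_vertex: Vertex) -> None:
        super().__init__(element_id, label)
        self.out_vertex = out_vertex  # vertex the edge leaves
        self.in_vertex = in_vertex    # vertex the edge points to


class Graph:
    """Maintains the set of vertices and edges."""

    def __init__(self) -> None:
        self.vertices: Dict[int, Vertex] = {}
        self.edges: Dict[int, Edge] = {}
        self._next_id = 0

    def _new_id(self) -> int:
        self._next_id += 1
        return self._next_id

    def add_vertex(self, label: str, **properties: Any) -> Vertex:
        vertex = Vertex(self._new_id(), label)
        for key, value in properties.items():
            vertex.properties[key] = VertexProperty(key, value)
        self.vertices[vertex.id] = vertex
        return vertex

    def add_edge(self, label: str, out_vertex: Vertex, in_vertex: Vertex,
                 **properties: Any) -> Edge:
        edge = Edge(self._new_id(), label, out_vertex, in_vertex)
        edge.properties.update(properties)
        out_vertex.out_edges.append(edge)
        in_vertex.in_edges.append(edge)
        self.edges[edge.id] = edge
        return edge


# Hypothetical sample data.
graph = Graph()
alice = graph.add_vertex("person", name="alice")
bob = graph.add_vertex("person", name="bob")
knows = graph.add_edge("knows", alice, bob, since=2015)
alice.properties["name"].meta["source"] = "import"  # metadata on a vertex property
```

In this sketch an edge stores plain values as properties, while a vertex stores `VertexProperty` objects so that each property can hold metadata of its own.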
The primary components of the TinkerPop process API are:
- TraversalSource
- A generator of traversals for a particular graph, domain-specific language (DSL), and execution engine.
- Traversal<S,E>
- A functional data flow process transforming objects of type S into objects of type E.
- GraphTraversal
- A traversal domain-specific language (DSL) that is oriented towards the semantics of the raw graph (i.e. vertices, edges, etc.).
- GraphComputer
- A system that processes the graph in parallel and, potentially, distributed over a multi-machine cluster.
- VertexProgram
- Code executed at all vertices in a logically parallel manner with intercommunication via message passing.
- MapReduce
- A computation that analyzes all vertices in the graph in parallel and yields a single reduced result.
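The vertex program and MapReduce descriptions above can be illustrated with a small, single-process sketch in plain Python. It does not use the TinkerPop GraphComputer API. The function names, the adjacency-list graph, and the choice of algorithm (propagating the smallest vertex id to label connected components, then counting component sizes) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical toy graph: undirected, so each edge is listed in both directions.
adjacency: Dict[int, List[int]] = {
    1: [2],
    2: [1, 3],
    3: [2],
    4: [5],
    5: [4],
}


def run_vertex_program(adjacency: Dict[int, List[int]],
                       max_supersteps: int = 10) -> Dict[int, int]:
    """Label every vertex with the smallest vertex id reachable from it."""
    # Each vertex starts with its own id as its label.
    state = {v: v for v in adjacency}

    # Initial round: every vertex sends its label to its neighbors.
    inbox: Dict[int, List[int]] = defaultdict(list)
    for v, neighbors in adjacency.items():
        for n in neighbors:
            inbox[n].append(state[v])

    for _ in range(max_supersteps):
        outbox: Dict[int, List[int]] = defaultdict(list)
        # Logically parallel: each vertex reads only its own state and inbox.
        for v in adjacency:
            smallest = min(inbox[v], default=state[v])
            if smallest < state[v]:
                state[v] = smallest
                for n in adjacency[v]:
                    outbox[n].append(smallest)  # message passing
        if not outbox:  # no messages sent, so the computation has converged
            break
        inbox = outbox
    return state


def map_reduce_component_sizes(state: Dict[int, int]) -> Dict[int, int]:
    """Map each vertex to (label, 1), then reduce by summing per label."""
    pairs = [(label, 1) for label in state.values()]  # map phase
    sizes: Dict[int, int] = defaultdict(int)
    for label, count in pairs:                        # reduce phase
        sizes[label] += count
    return dict(sizes)


labels = run_vertex_program(adjacency)
sizes = map_reduce_component_sizes(labels)
print(labels)  # {1: 1, 2: 1, 3: 1, 4: 4, 5: 4}
print(sizes)   # {1: 3, 4: 2}
```

The vertex loop runs sequentially here, but because each vertex touches only its own state and messages, the same logic could in principle be split across workers, which is the role the GraphComputer plays.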
A key feature of TinkerPop is Gremlin, a graph traversal language and virtual machine. TinkerPop and Gremlin are to graph databases what JDBC and SQL are to relational databases. Gremlin variants are available for many languages: Java, Groovy, Python, and others.
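The idea of a traversal as a data flow from a source can be sketched in plain Python as a lazy chain of steps. This is a toy model, not Gremlin or a Gremlin language variant: the classes below, the sample data, and the `to_list` method are assumptions, and the step names `V`, `has`, `out`, and `values` are borrowed from Gremlin only to make the shape of a query recognizable.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical toy data: vertex id -> properties, plus labeled directed edges.
VERTICES: Dict[int, Dict[str, Any]] = {
    1: {"name": "alice"},
    2: {"name": "bob"},
    3: {"name": "carol"},
}
EDGES: List[Tuple[int, str, int]] = [
    (1, "knows", 2),
    (1, "knows", 3),
    (2, "created", 3),
]

Step = Callable[[Iterator[Any]], Iterator[Any]]


class Traversal:
    """A lazy chain of steps; each step turns one stream of objects into another."""

    def __init__(self, start: Callable[[], Iterable[Any]]) -> None:
        self._start = start
        self._steps: List[Step] = []

    def _add(self, step: Step) -> "Traversal":
        self._steps.append(step)
        return self

    def has(self, key: str, value: Any) -> "Traversal":
        # Keep only vertices whose property matches.
        return self._add(
            lambda ids: (v for v in ids if VERTICES[v].get(key) == value))

    def out(self, label: str) -> "Traversal":
        # Move from each vertex to the vertices at the end of its outgoing edges.
        return self._add(
            lambda ids: (dst for v in ids
                         for (src, lbl, dst) in EDGES
                         if src == v and lbl == label))

    def values(self, key: str) -> "Traversal":
        # Replace each vertex with one of its property values.
        return self._add(lambda ids: (VERTICES[v][key] for v in ids))

    def to_list(self) -> List[Any]:
        # Nothing is evaluated until the result is requested.
        stream: Iterator[Any] = iter(self._start())
        for step in self._steps:
            stream = step(stream)
        return list(stream)


class TraversalSource:
    """Generates traversals over the toy graph."""

    def V(self) -> Traversal:
        return Traversal(lambda: VERTICES.keys())


g = TraversalSource()
names = g.V().has("name", "alice").out("knows").values("name").to_list()
print(names)  # ['bob', 'carol']
```

Each step consumes objects of one type and emits objects of another (vertex ids in, names out), which is the S-to-E transformation described for `Traversal<S,E>`.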