Data model introduction
Brief introduction to the elements of a property graph.
Graph databases represent simple or complex relationships between objects. The objects can be tangible or intangible, such as people, software, or locations. Analyzing how objects interact with one another and with their environment can yield useful results. Unlike relational databases, graph databases are designed to make the relationships between objects directly queryable.
- vertex
- A vertex defines an object such as a person, location, or recipe; think of a vertex as a noun. Each vertex has a unique identifier and a single label that denotes the type of entity the vertex represents. DataStax recommends setting the unique identifier. Properties supply the partition keys and clustering keys that determine the physical storage location, as well as additional non-primary-key attributes. Two vertices are adjacent to one another if they share an edge, whereas a vertex is incident to each edge that connects to it. Vertices may have zero to many defined properties. For scale, a graph can contain billions of vertices. CAUTION: DSE Graph limits the number of vertex labels to 200 per graph.
- edge
- An edge defines a directional binary relationship, or connection, between two vertices. A person can create software, or a person can write a book; think of edges as verbs. Each edge in DSE Graph has a unique autogenerated identifier and a single label that denotes the type of connection the edge represents. Because edge identifiers are autogenerated, multiple edges with the same direction and label can connect the same two vertices; this is the default for defined edges. For convenience, DSE Graph stores each unique edge in both directions. Edges may have zero to many defined properties. Edges are incident to a vertex. For scale, a graph can contain billions of edges.
- property
- A property defines an attribute of a vertex, an edge, or another property; it consists of a property key and a property value. A property with multiple values is called a multi-property, and a property of another property is called a meta-property. All properties are global in DSE Graph, so a property key, such as name, can be used for more than one element. A property has a single value by default in DSE Graph, but multi-properties can be defined for vertices. Properties are not mandatory for vertices or edges, although DataStax recommends setting vertex identifiers with properties that define the partition keys and clustering keys.
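The three elements above can be pictured with a small in-memory sketch. This is plain Python for illustration only, not the DSE Graph API: the class names `Vertex` and `Edge`, the `adjacent` helper, the counter used to mimic autogenerated edge identifiers, and the sample data are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Any, Dict, List

# Hypothetical stand-in for autogenerated edge identifiers.
_edge_ids = count(1)


@dataclass
class Vertex:
    vertex_id: Any   # unique identifier, set by the user
    label: str       # single label naming the entity type (the "noun")
    properties: Dict[str, Any] = field(default_factory=dict)  # zero to many


@dataclass
class Edge:
    label: str       # single label naming the connection type (the "verb")
    out_v: Vertex    # vertex the edge leaves
    in_v: Vertex     # vertex the edge enters
    properties: Dict[str, Any] = field(default_factory=dict)  # zero to many
    edge_id: int = field(default_factory=lambda: next(_edge_ids))


def adjacent(a: Vertex, b: Vertex, edges: List[Edge]) -> bool:
    """Two vertices are adjacent if at least one edge connects them."""
    ends = {a.vertex_id, b.vertex_id}
    return any({e.out_v.vertex_id, e.in_v.vertex_id} == ends for e in edges)


# The same property key ("name") is reused across different vertex labels.
# A list value stands in for a multi-property in this sketch.
alice = Vertex(1, "person", {"name": "Alice", "nickname": ["Al", "Ali"]})
book = Vertex(2, "book", {"name": "Graph Basics"})
bob = Vertex(3, "person")  # properties are optional

# Two edges with the same label and direction between the same two vertices
# remain distinct because each receives its own identifier.
edges = [
    Edge("authored", alice, book, {"year": 2019}),
    Edge("authored", alice, book),
]
```

Here `adjacent(alice, book, edges)` holds because the two vertices share an edge, while `bob` is adjacent to nothing. Meta-properties are left out of the sketch to keep it short.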
An important concept is the addressability of vertices and edges. Only vertices are globally addressable. Vertex labels are specified with a primary key consisting of partition keys and clustering keys that define the storage location. In contrast, edges are only locally addressable as adjacent to a vertex. Not all queries are defined with primary keys, so indexing plays a critical role in querying graphs. A data model must also define the indexes used to find data in a graph using non-primary-key information. DSE Graph takes advantage of three built-in mechanisms to create indexes: materialized views (MVs), secondary indexes, and search indexes. One implication of addressability is that queries start with vertices and can traverse to edges, but querying edges themselves is not a viable option. Additionally, meta-properties of vertices can be indexed, but meta-properties of edges cannot.
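The addressing rules described above can be sketched as a toy in-memory store. It is a simplified illustration in plain Python, not how DSE Graph is implemented: the `TinyGraph` class, its method names, and the dictionary-based `name_index` (standing in for a real index on a non-primary-key property) are hypothetical, and the sketch stores edges in one direction only.

```python
from collections import defaultdict


class TinyGraph:
    def __init__(self):
        self.vertices = {}                    # primary key -> vertex properties
        self.out_edges = defaultdict(list)    # primary key -> edges kept with that vertex
        self.name_index = defaultdict(set)    # "name" value -> primary keys

    def add_vertex(self, label, key, **props):
        pk = (label, key)                     # the primary key locates the vertex
        self.vertices[pk] = props
        if "name" in props:                   # maintain the non-primary-key index
            self.name_index[props["name"]].add(pk)
        return pk

    def add_edge(self, out_pk, label, in_pk):
        # An edge has no global address; it is stored under its vertex.
        self.out_edges[out_pk].append((label, in_pk))

    def get_vertex(self, pk):
        """Global addressing: direct lookup by primary key."""
        return self.vertices[pk]

    def find_by_name(self, name):
        """Index lookup for queries that do not know the primary key."""
        return sorted(self.name_index.get(name, ()))

    def out(self, pk, label):
        """Local addressing: edges are reached only from a known vertex."""
        return [dst for (lbl, dst) in self.out_edges[pk] if lbl == label]


g = TinyGraph()
alice = g.add_vertex("person", 1, name="Alice")
book = g.add_vertex("book", 7, name="Graph Basics")
g.add_edge(alice, "authored", book)

# Start at vertices found through the index, then traverse their edges.
titles = [
    g.get_vertex(dst)["name"]
    for src in g.find_by_name("Alice")
    for dst in g.out(src, "authored")
]
```

Note that the store offers no way to look up an edge on its own: every traversal begins with a vertex found by primary key or through the index, which mirrors the restriction described above.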
Each of these elements, as well as indexes, plays a role in a well-designed data model and in the performance of graph queries.