Creating graph schema
Creating graph database schema.
Designing a data model is the critical first step toward creating a graph database
schema. Once the data model is designed and a graph is created, the next step is to
define the schema for the vertices, the edges, and their properties. Schema scripts
are written in Gremlin-Groovy, which is packaged with the Apache TinkerPop engine
and can be run in either DataStax Studio or the Gremlin console
(dse gremlin-console) installed with DataStax Enterprise.
Graph schema can be created with create() or added to existing schema with add().
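As a minimal sketch of that difference, assuming the schema alias is bound as it is elsewhere on this page (Studio or the Gremlin console), the lines below create a property key and a vertex label and then attach the key to the already existing label. The names come from the recipe example; pairing them with properties(...).add() here is an illustration of the pattern, not a line taken from the example script.

```groovy
// create(): define new schema elements (ifNotExists() makes the calls safe to rerun)
schema.propertyKey('name').Text().ifNotExists().create()
schema.vertexLabel('author').ifNotExists().create()

// add(): extend an element that already exists, here by attaching
// the 'name' property key to the existing 'author' vertex label
schema.vertexLabel('author').properties('name').add()
```

Because both create() calls use ifNotExists(), running this sketch before the full example script should not conflict with the script's own definitions of name and author.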
Prerequisites
Procedure
- Optional. If you are reusing a graph that you previously created, drop the graph schema and data.
-
Optional. If running large scripts in Gremlin console, set the timeout value to max to prevent client-side timeouts. Use this setting to ensure that script processing will complete. This step cannot be completed in Studio.
gremlin> :remote config timeout max
-
Optional. If running large scripts, set the evaluation_timeout value high enough to prevent server-side timeouts; this example uses PT10M (10 minutes). Use this setting to ensure that script processing will complete.
graph.schema().config().option("graph.traversal_sources.g.evaluation_timeout").set("PT10M")
Important: Setting a timeout value greater than 1095 days (maximum integer) can exceed the limit of a graph session. Starting a new session and setting the timeout to a lower value can recover access to a hung session. This caution applies to all timeouts: evaluation_timeout, system_evaluation_timeout, analytic_evaluation_timeout, and realtime_evaluation_timeout.
-
Load the example schema listed in the Example below:
NOTE: Each command submitted is within a single session, so from cell to cell (Studio) or line to line (Gremlin console), the Gremlin server is not aware of any variables set on the previous line. If any of the lines in the Recipe Schema are entered separately, an error will occur on the edge creation commands.
- The following steps show the details of the full script broken down into sections.
-
Define the properties for the vertices and the edges. Each property is specified
with a data type in addition to a key name. All properties created in this
example are Text, Int, or Timestamp; other data types are available. Properties
are used to retrieve selective subsets of the graph and to retrieve stored
values. Properties are global in nature, and the pairing of a vertex label and a
property uniquely identifies a property for use in traversals. Edge
properties are expensive to update, because the whole edge with all its
properties is deleted and recreated to update any edge property. Use edge
properties only in situations that warrant their use.
// Property Keys
// Check for previous creation of property key with ifNotExists()
schema.propertyKey('name').Text().ifNotExists().create()
schema.propertyKey('gender').Text().create()
schema.propertyKey('instructions').Text().create()
schema.propertyKey('category').Text().create()
schema.propertyKey('year').Int().create()
schema.propertyKey('timestamp').Timestamp().create()
schema.propertyKey('ISBN').Text().create()
schema.propertyKey('calories').Int().create()
schema.propertyKey('amount').Text().create()
schema.propertyKey('stars').Int().create()
schema.propertyKey('comment').Text().single().create() // single() is optional - default
// Example of multiple property
// schema.propertyKey('nickname').Text().multiple().create();
// Example meta-property added to property:
// schema.propertyKey('livedIn').Text().create()
// schema.propertyKey('country').Text().multiple().properties('livedIn').create()
Property keys can be checked for prior existence with ifNotExists(). Property keys can be created with either single or multiple cardinality with single() or multiple(). The default is single cardinality, which does not have to be specified, but it can be explicitly stated as in the example.
Meta-properties, or properties of properties, can be created using propertyKey() followed by properties(). The property key must exist prior to the creation of a meta-property. Meta-properties cannot be nested, i.e., a meta-property cannot have a meta-property. In this example, country is the property that has a meta-property livedIn. This property and meta-property are used to represent the countries that an author has lived in at various times in their life.
{
  "name":"Julia Child",
  "gender":"F",
  [ {"country": "United States", "livedIn": "1929-1949" },
    {"country": "France", "livedIn": "1949-1952" } ],
  "authored":[{
    "book":{
      "label":"book",
      "bookTitle":"Art of French Cooking Volume One",
      "publishDate":1968
    },
    "book":{
      "label":"book",
      "bookTitle":"The French Chef Cookbook",
      "publishDate":1968,
      "ISBN": "0-394-40135-2"
    }
  }],
  "created": [{
    "type" : "recipe",
    "recipeTitle" : "BeefBourguignon",
    "instructions" : "Braise the beef.",
    "createDate":1967
  },
  {
    "type" : "recipe",
    "recipeTitle" : "Salade Nicoise",
    "instructions" : "Break the lettuce into pieces.",
    "createDate": 1970
  }]
}
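To make the country and livedIn pairing concrete, here is a small sketch of how such data might be written and read back with standard TinkerPop steps. It assumes an in-memory TinkerGraph in a plain Gremlin console rather than a DSE graph, so no schema is required; on DSE Graph the property keys would have to be defined first, as in the commented schema lines, and associated with the author vertex label. The variable names are illustrative only.

```groovy
// Assumption: plain Gremlin console with the TinkerGraph plugin active
graph = TinkerGraph.open()
g = graph.traversal()

// Create the author vertex
julia = g.addV('author').property('name', 'Julia Child').next()

// Add two values for 'country' (list cardinality), each carrying
// its own 'livedIn' meta-property
g.V(julia).property(list, 'country', 'United States', 'livedIn', '1929-1949').iterate()
g.V(julia).property(list, 'country', 'France', 'livedIn', '1949-1952').iterate()

// Read the meta-property attached to one specific 'country' value
g.V(julia).properties('country').hasValue('France').values('livedIn')
```

The last traversal selects the country property whose value is France and then steps into its meta-property, which is why livedIn is read with values() on the property rather than on the vertex.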
-
Define the vertex labels. The vertex labels identify the type of vertices that
can be created.
// Vertex Labels
schema.vertexLabel('author').ifNotExists().create()
schema.vertexLabel('recipe').create()
// Example of creating vertex label with properties
// schema.vertexLabel('recipe').properties('name','instructions').create()
schema.vertexLabel('ingredient').create()
schema.vertexLabel('book').create()
schema.vertexLabel('meal').create()
schema.vertexLabel('reviewer').create()
// Example of custom vertex id:
// schema.propertyKey('city_id').Int().create()
// schema.propertyKey('sensor_id').Uuid().create()
// schema.vertexLabel('FridgeSensor').partitionKey('city_id').clusteringKey('sensor_id').create()
Vertex labels can be checked for prior existence using ifNotExists(). Vertex labels can be created along with properties. Vertex labels can be created with user-defined vertex ids, rather than the auto-generated vertex ids.
Note: Auto-generated vertex ids are deprecated with DSE 6.0.
CAUTION: DSE Graph limits the number of vertex labels to 200 per graph.
-
Define the edge labels. The edge labels identify the type of edges that can be
created.
// Edge Labels
schema.edgeLabel('authored').ifNotExists().create()
schema.edgeLabel('created').create()
schema.edgeLabel('includes').create()
schema.edgeLabel('includedIn').create()
schema.edgeLabel('rated').properties('stars').connection('reviewer','recipe').create()
Edge labels can be checked for prior existence using ifNotExists(). Edge labels can be created with adjacent vertex labels identified using connection(). Edge labels can identify properties for an edge using properties().
-
Define indexes that can speed up query processing. All types of indexes are
presented here. See Indexing for more information.
// Vertex Indexes
// Secondary
schema.vertexLabel('author').index('byName').secondary().by('name').add()
// Materialized
schema.vertexLabel('recipe').index('byRecipe').materialized().by('name').add()
schema.vertexLabel('meal').index('byMeal').materialized().by('name').add()
schema.vertexLabel('ingredient').index('byIngredient').materialized().by('name').add()
schema.vertexLabel('reviewer').index('byReviewer').materialized().by('name').add()
// Search
// schema.vertexLabel('recipe').index('search').search().by('instructions').asText().add()
// schema.vertexLabel('recipe').index('search').search().by('instructions').asString().add()
// If more than one property key is search indexed
// schema.vertexLabel('recipe').index('search').search().by('instructions').asText().by('category').asString().add()
// Edge Index
schema.vertexLabel('reviewer').index('ratedByStars').outE('rated').by('stars').add()
// Example of property index using meta-property 'livedIn':
// schema.vertexLabel('author').index('byLocation').property('country').by('livedIn').add()
These indexes are included to make the schema for the food example more efficient for data loading.
Note: The difference between create() and add() is subtle but important. Use create() to define a new entity, such as a property key, vertex label, or edge label. If an entity (vertex label or edge label) already exists and an index or property keys are to be associated with it, use add(). For example, a vertex label and property keys can be created, and then the property keys can be added to the vertex label.
-
After creating the graph schema, examine the schema to verify. A portion of the
output is shown.
schema.describe()
Example
// RECIPE SCHEMA
// To run in Studio, copy and paste all lines to a cell and run.
// To run in Gremlin console, use the next two lines:
// script = new File('/tmp/RecipeSchema.groovy').text; []
// :> @script
// Property Keys
// Check for previous creation of property key with ifNotExists()
schema.propertyKey('name').Text().ifNotExists().create()
schema.propertyKey('gender').Text().create()
schema.propertyKey('instructions').Text().create()
schema.propertyKey('category').Text().create()
schema.propertyKey('year').Int().create()
schema.propertyKey('timestamp').Timestamp().create()
schema.propertyKey('ISBN').Text().create()
schema.propertyKey('calories').Int().create()
schema.propertyKey('amount').Text().create()
schema.propertyKey('stars').Int().create()
schema.propertyKey('comment').Text().single().create() // single() is optional - default
// Example of multiple property
// schema.propertyKey('nickname').Text().multiple().create();
// Example meta-property added to property:
// schema.propertyKey('livedIn').Text().create()
// schema.propertyKey('country').Text().properties('livedIn').create()
// Vertex Labels
schema.vertexLabel('author').ifNotExists().create()
schema.vertexLabel('recipe').create()
// Example of creating vertex label with properties
// schema.vertexLabel('recipe').properties('name','instructions').create()
schema.vertexLabel('ingredient').create()
schema.vertexLabel('book').create()
schema.vertexLabel('meal').create()
schema.vertexLabel('reviewer').create()
// Example of custom vertex id:
// schema.propertyKey('city_id').Int().create()
// schema.propertyKey('sensor_id').Uuid().create()
// schema.vertexLabel('FridgeSensor').partitionKey('city_id').clusteringKey('sensor_id').create()
// Edge Labels
schema.edgeLabel('authored').ifNotExists().create()
schema.edgeLabel('created').create()
schema.edgeLabel('includes').create()
schema.edgeLabel('includedIn').create()
schema.edgeLabel('rated').properties('stars').connection('reviewer','recipe').create()
// Vertex Indexes
// Secondary
schema.vertexLabel('author').index('byName').secondary().by('name').add()
// Materialized
schema.vertexLabel('recipe').index('byRecipe').materialized().by('name').add()
schema.vertexLabel('meal').index('byMeal').materialized().by('name').add()
schema.vertexLabel('ingredient').index('byIngredient').materialized().by('name').add()
schema.vertexLabel('reviewer').index('byReviewer').materialized().by('name').add()
// Search
// schema.vertexLabel('recipe').index('search').search().by('instructions').asText().add()
// schema.vertexLabel('recipe').index('search').search().by('instructions').asString().add()
// If more than one property key is search indexed
// schema.vertexLabel('recipe').index('search').search().by('instructions').asText().by('category').asString().add()
// Edge Index
schema.vertexLabel('reviewer').index('ratedByStars').outE('rated').by('stars').add()
// Example of property index using meta-property 'livedIn':
// schema.vertexLabel('author').index('byLocation').property('country').by('livedIn').add()
// Schema description
// Use to check that the schema is built as desired
schema.describe()