Creating a graph database schema using Groovy in the Gremlin console.
Creating a data model for a graph database is the critical first step toward creating a schema. Once the data model is designed and a graph is created, the next step is to define the schema for the vertices, the edges, and their properties. Gremlin-Groovy is packaged with the Apache TinkerPop Gremlin console. Either use Gremlin-Groovy to create a script that contains the Gremlin commands, or enter the commands directly into the Gremlin console.
Procedure
-
Start the Gremlin console.
-
Create a new graph to store the data and alias a graph traversal to run queries. If you are reusing a graph that you previously created, drop the graph schema and data.
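As a rough sketch of this step in the DSE Gremlin console, the commands below create a graph and alias the traversal source g to it. The graph name food is a placeholder chosen for illustration, and the cleanup calls are assumed to be available in your DSE Graph version; check them against the schema API reference before relying on them.

```groovy
// Assumption: 'food' is a hypothetical graph name; substitute your own
system.graph('food').ifNotExists().create()

// Console command: point the traversal source g at the new graph
:remote config alias g food.g

// When reusing an existing graph: remove all vertices (and with them their edges)
g.V().drop().iterate()

// Remove the schema definitions; assumed to be exposed as schema.clear() in your DSE version
schema.clear()
```

These commands need a running DSE Graph server, so treat them as a configuration fragment rather than a standalone script.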
- Optional:
On the remote Gremlin Server, set the timeout value to max to prevent client-side timeouts. Use this setting to ensure that script processing will complete.
gremlin> :remote config timeout max
- Optional:
If running large scripts, raise the evaluation_timeout value to prevent server-side timeouts. Use this setting to ensure that script processing will complete. The command shown here sets the timeout to PT10M, the ISO 8601 notation for ten minutes.
graph.schema().config().option("graph.traversal_sources.g.evaluation_timeout").set("PT10M")
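The value PT10M is written in ISO 8601 duration notation. As a small aside, plain Groovy can confirm how such strings translate into time spans using the JDK class java.time.Duration. This sketch is independent of DSE Graph, and the variable names are illustrative only.

```groovy
import java.time.Duration

// Parse ISO 8601 duration strings of the kind passed to evaluation_timeout
tenMinutes = Duration.parse('PT10M')
ninetySeconds = Duration.parse('PT1M30S')

assert tenMinutes.toMinutes() == 10
assert ninetySeconds.getSeconds() == 90

// prints "PT10M = 600000 ms"
println "PT10M = ${tenMinutes.toMillis()} ms"
```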
-
A script that creates the schema is shown in the Example at the bottom of this page. Load the script file to run it on the remote Gremlin Server. The script will fail if you have previously run scripts from the Quick Start, unless the schema and any data have been cleared from the graph.
gremlin> :load /tmp/RecipeSchema.groovy
-
The following steps show the details of the full script, broken down into sections.
-
Define the property keys for the vertices and the edges. Each property key is given a name and a data type. All properties created in this example are of type Text, Int, or Timestamp; other data types are available. Properties are used to retrieve selected subsets of the graph and to retrieve stored values. Property keys are global, and pairing a vertex label with a property key uniquely identifies a property for use in traversals. Edge properties are expensive to update because they are deleted and recreated, so use them only where they are warranted.
// Property Keys
// Check for previous creation of property key with ifNotExists()
schema.propertyKey('name').Text().ifNotExists().create()
schema.propertyKey('gender').Text().create()
schema.propertyKey('instructions').Text().create()
schema.propertyKey('category').Text().create()
schema.propertyKey('year').Int().create()
schema.propertyKey('timestamp').Timestamp().create()
schema.propertyKey('ISBN').Text().create()
schema.propertyKey('calories').Int().create()
schema.propertyKey('amount').Text().create()
schema.propertyKey('stars').Int().create()
// single() is optional, as it is the default
schema.propertyKey('comment').Text().single().create()
// Example of a multiple property that can have several values
// Next 4 lines define two properties, then create a meta-property 'livedIn' on 'country'
// A meta-property is a property of a property
// EX: 'livedIn': '1999-2005' 'country': 'Belgium'
// schema.propertyKey('nickname').Text().multiple().create()
// schema.propertyKey('country').Text().create()
// schema.propertyKey('livedIn').Text().create()
// schema.propertyKey('country').Text().properties('livedIn').create()
Property keys can be checked for prior existence with ifNotExists(). Property keys can be created with either single or multiple cardinality using single() or multiple(). Single cardinality is the default and does not have to be specified, but it can be stated explicitly, as in the example.
Meta-properties, or properties of properties, can be created using propertyKey() followed by properties(). The property key must exist prior to the creation of a meta-property.
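To make multiple cardinality and meta-properties more concrete, the sketch below writes and reads such values with standard TinkerPop traversal steps. It assumes a local (non-remote) Gremlin console session and uses an in-memory TinkerGraph as a stand-in for the DSE graph, so no schema is enforced. The names demoGraph and demo and the sample values are made up for illustration.

```groovy
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph
import static org.apache.tinkerpop.gremlin.structure.VertexProperty.Cardinality.list

// Assumption: an in-memory TinkerGraph stands in for the DSE graph
demoGraph = TinkerGraph.open()
demo = demoGraph.traversal()

// 'nickname' receives two values (list cardinality is used here to mimic multiple());
// 'country' carries the meta-property 'livedIn'
demo.addV('author').
     property('name', 'Julia Child').
     property(list, 'nickname', 'Jules').
     property(list, 'nickname', 'JC').
     property('country', 'Belgium', 'livedIn', '1999-2005').
     iterate()

// All values stored under the multi-valued key
nicknames = demo.V().has('author', 'name', 'Julia Child').values('nickname').toList()

// The meta-property is read from the 'country' property, not from the vertex itself
livedIn = demo.V().has('author', 'name', 'Julia Child').
               properties('country').values('livedIn').next()
```

Against a DSE graph, the same traversal shapes would only succeed once the corresponding property keys exist in the schema.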
-
Define the vertex labels. The vertex labels identify the type of vertices that
can be created.
// Vertex Labels
schema.vertexLabel('author').ifNotExists().create()
schema.vertexLabel('recipe').create()
// Example of creating vertex label with properties
// schema.vertexLabel('recipe').properties('name','instructions').create()
schema.vertexLabel('ingredient').create()
schema.vertexLabel('book').create()
schema.vertexLabel('meal').create()
schema.vertexLabel('reviewer').create()
// Example of custom vertex id:
// schema.propertyKey('city_id').Int().create()
// schema.propertyKey('sensor_id').Uuid().create()
// schema.vertexLabel('FridgeSensor').partitionKey('city_id').clusteringKey('sensor_id').create()
Vertex labels can be checked for prior existence using ifNotExists(). Vertex labels can be created along with properties. Vertex labels can be created with custom vertex ids, rather than the standard vertex ids.
Note: Standard auto-generated ids are deprecated with DSE 6.0. Custom ids will undergo changes, and specifying vertex ids with partitionKey and clusteringKey will likely become the normal method.
DSE Graph limits the number of vertex labels to 200 per graph.
-
Define the edge labels. The edge labels identify the type of edges that can be
created.
// Edge Labels
schema.edgeLabel('authored').ifNotExists().create()
schema.edgeLabel('created').create()
schema.edgeLabel('includes').create()
schema.edgeLabel('includedIn').create()
schema.edgeLabel('rated').properties('stars').connection('reviewer','recipe').create()
Edge labels can be checked for prior existence using ifNotExists(). Edge labels can be created with adjacent vertex labels identified using connection().
-
Define indexes to speed up query processing. All types of indexes are presented here.
// Vertex Indexes
// Secondary
schema.vertexLabel('author').index('byName').secondary().by('name').add()
// Materialized
schema.vertexLabel('recipe').index('byRecipe').materialized().by('name').add()
schema.vertexLabel('meal').index('byMeal').materialized().by('name').add()
schema.vertexLabel('ingredient').index('byIngredient').materialized().by('name').add()
schema.vertexLabel('reviewer').index('byReviewer').materialized().by('name').add()
// Search
// schema.vertexLabel('recipe').index('search').search().by('instructions').asText().add()
// schema.vertexLabel('recipe').index('search').search().by('instructions').asString().add()
// If more than one property key is search indexed
// schema.vertexLabel('recipe').index('search').search().by('instructions').asText().by('category').asString().add()
// Edge Index
schema.vertexLabel('reviewer').index('ratedByStars').outE('rated').by('stars').add()
// Example of property index using meta-property 'livedIn':
// schema.vertexLabel('author').index('byLocation').property('country').by('livedIn').add()
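The sketch below shows the kinds of traversals these indexes are meant to serve: a vertex lookup by label and property value, and an edge filter on stars starting from a reviewer. It assumes a local Gremlin console session and runs against an in-memory TinkerGraph with made-up sample data, so it illustrates the query shapes only and does not exercise DSE indexes. The names demoGraph, demo, and the sample values are hypothetical.

```groovy
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph
import static org.apache.tinkerpop.gremlin.process.traversal.P.gte

// Assumption: in-memory TinkerGraph with sample data; no DSE indexes are involved
demoGraph = TinkerGraph.open()
demo = demoGraph.traversal()

reviewer = demo.addV('reviewer').property('name', 'John Doe').next()
stew = demo.addV('recipe').property('name', 'Beef Stew').next()
salad = demo.addV('recipe').property('name', 'Green Salad').next()

// 'rated' edges run from reviewer to recipe, matching connection('reviewer','recipe')
reviewer.addEdge('rated', stew, 'stars', 5)
reviewer.addEdge('rated', salad, 'stars', 3)

// Vertex lookup by label and property value: the shape a byReviewer-style index targets
found = demo.V().has('reviewer', 'name', 'John Doe').values('name').toList()

// Edge filter on 'stars' from a reviewer: the shape the ratedByStars edge index targets
wellRated = demo.V().has('reviewer', 'name', 'John Doe').
                 outE('rated').has('stars', gte(4)).
                 inV().values('name').toList()
```

Here wellRated contains only the recipe rated with at least four stars.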
-
After creating the graph schema, examine it with schema.describe() to verify that it was built as intended. This command is included as the last command of the full script.
gremlin> schema.describe()
==>schema.propertyKey("member_id").Smallint().single().create()
schema.propertyKey("instructions").Text().single().create()
schema.propertyKey("amount").Text().single().create()
schema.propertyKey("gender").Text().single().create()
schema.propertyKey("year").Int().single().create()
schema.propertyKey("calories").Int().single().create()
schema.propertyKey("stars").Int().single().create()
schema.propertyKey("community_id").Int().single().create()
schema.propertyKey("ISBN").Text().single().create()
schema.propertyKey("name").Text().single().create()
schema.propertyKey("comment").Text().single().create()
schema.propertyKey("category").Text().single().create()
schema.propertyKey("timestamp").Timestamp().single().create()
schema.edgeLabel("authored").multiple().create()
schema.edgeLabel("rated").multiple().properties("stars").create()
schema.edgeLabel("includedIn").multiple().create()
schema.edgeLabel("created").multiple().create()
schema.edgeLabel("includes").multiple().create()
schema.vertexLabel("meal").properties("name").create()
schema.vertexLabel("meal").index("byMeal").materialized().by("name").add()
schema.vertexLabel("ingredient").properties("name").create()
schema.vertexLabel("ingredient").index("byIngredient").materialized().by("name").add()
schema.vertexLabel("author").properties("name").create()
schema.vertexLabel("author").index("byName").secondary().by("name").add()
schema.vertexLabel("book").create()
schema.vertexLabel("recipe").properties("name").create()
schema.vertexLabel("recipe").index("byRecipe").materialized().by("name").add()
schema.vertexLabel("reviewer").properties("name").create()
schema.vertexLabel("reviewer").index("byReviewer").materialized().by("name").add()
schema.vertexLabel("reviewer").index("ratedByStars").outE("rated").by("stars").add()
schema.edgeLabel("rated").connection("recipe", "reviewer").connection("reviewer", "recipe").add()
Example
// RECIPE SCHEMA
// To run in Studio, copy and paste all lines to a cell and run.
// To run in Gremlin console, use the load command
// :load /tmp/RecipeSchema.groovy
// Property Keys
// Check for previous creation of property key with ifNotExists()
schema.propertyKey('name').Text().ifNotExists().create()
schema.propertyKey('gender').Text().create()
schema.propertyKey('instructions').Text().create()
schema.propertyKey('category').Text().create()
schema.propertyKey('year').Int().create()
schema.propertyKey('timestamp').Timestamp().create()
schema.propertyKey('ISBN').Text().create()
schema.propertyKey('calories').Int().create()
schema.propertyKey('amount').Text().create()
schema.propertyKey('stars').Int().create()
// single() is optional, as it is the default
schema.propertyKey('comment').Text().single().create()
// Example of a multiple property that can have several values
// Next 4 lines define two properties, then create a meta-property 'livedIn' on 'country'
// A meta-property is a property of a property
// EX: 'livedIn': '1999-2005' 'country': 'Belgium'
// schema.propertyKey('nickname').Text().multiple().create()
// schema.propertyKey('country').Text().create()
// schema.propertyKey('livedIn').Text().create()
// schema.propertyKey('country').Text().properties('livedIn').create()
// Vertex Labels
schema.vertexLabel('author').ifNotExists().create()
schema.vertexLabel('recipe').create()
// Example of creating vertex label with properties
// schema.vertexLabel('recipe').properties('name','instructions').create()
schema.vertexLabel('ingredient').create()
schema.vertexLabel('book').create()
schema.vertexLabel('meal').create()
schema.vertexLabel('reviewer').create()
// Example of custom vertex id:
// schema.propertyKey('city_id').Int().create()
// schema.propertyKey('sensor_id').Uuid().create()
// schema.vertexLabel('FridgeSensor').partitionKey('city_id').clusteringKey('sensor_id').create()
// Edge Labels
schema.edgeLabel('authored').ifNotExists().create()
schema.edgeLabel('created').create()
schema.edgeLabel('includes').create()
schema.edgeLabel('includedIn').create()
schema.edgeLabel('rated').properties('stars').connection('reviewer','recipe').create()
// Vertex Indexes
// Secondary
schema.vertexLabel('author').index('byName').secondary().by('name').add()
// Materialized
schema.vertexLabel('recipe').index('byRecipe').materialized().by('name').add()
schema.vertexLabel('meal').index('byMeal').materialized().by('name').add()
schema.vertexLabel('ingredient').index('byIngredient').materialized().by('name').add()
schema.vertexLabel('reviewer').index('byReviewer').materialized().by('name').add()
// Search
// schema.vertexLabel('recipe').index('search').search().by('instructions').asText().add()
// schema.vertexLabel('recipe').index('search').search().by('instructions').asString().add()
// If more than one property key is search indexed
// schema.vertexLabel('recipe').index('search').search().by('instructions').asText().by('category').asString().add()
// Edge Index
schema.vertexLabel('reviewer').index('ratedByStars').outE('rated').by('stars').add()
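// Example of property index using meta-property 'livedIn':
// (commented out because the 'country' and 'livedIn' property keys above are commented out)
// schema.vertexLabel('author').index('byLocation').property('country').by('livedIn').add()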
// Schema description
// Use to check that the schema is built as desired
schema.describe()