QuickStart Graph schema
Working with graph schema.
About this task
Before adding more data to the graph, let’s stop and talk about schemas, which define both vertex labels and edge labels along with their associated properties and data types. User-defined types (UDTs) are a feature of DataStax Graph (DSG) and must be created before they are used as a data type in schema. Schema creation also includes index creation; indexes play an important role in making graph traversals efficient and fast. See creating schema and creating indexes for more information.
Schema can also be modified after creation, adding properties to vertex labels or edge labels with addProperty() and alter().
Let’s create additional schema for the food graph. These steps demonstrate dropping and creating schema.
As with all queries in Graph, if you are using Gremlin console, alias the graph traversal g to a graph before running the steps below.
Procedure
-
Drop the previously-created schema. The new schema could be added, but this step ensures that no leftover schema or data corrupts the graph.
schema.drop()
Dropping schema will drop all data associated with the schema. Schema can also be dropped individually for vertex labels, edge labels, properties, and indexes.
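As a minimal sketch of dropping schema elements individually rather than all at once, the statements below call drop() on a single edge label and a single vertex label. They assume the authored edge label and the book vertex label from this QuickStart exist in the current graph; immediately after schema.drop() they would not, so treat this as an illustration rather than a step to run here. Dropping the edge label before the vertex label it connects is a cautious ordering, not a documented requirement.

```groovy
// Sketch only: assumes these labels already exist in the current graph.
// Drop a single edge label; data stored for that label is dropped with it.
schema.edgeLabel('authored').
    drop()

// Drop a single vertex label and its data.
schema.vertexLabel('book').
    drop()
```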
-
User-defined type (UDT) schema
-
Create some useful UDTs for the vertex labels in the next step:
// Create user-defined types (UDTs) with Gremlin
// USER-DEFINED TYPES
// ********
// SYNTAX:
// schema.type('typename')
//    [ .ifNotExists() ]
//    [ .property(property, propertyType) ]
//    [ .create() | .describe() ]

// USER-DEFINED TYPE
// START-createUDT_address
schema.type('address').
    ifNotExists().
    property('address1', Text).
    property('address2', Text).
    property('city_code', Text).
    property('state_code', Text).
    property('zip_code', Text).
    create()
// END-createUDT_address

// START-createUDT_fullname
schema.type('fullname').
    ifNotExists().
    property('firstname', Text).
    property('lastname', Text).
    create()
// END-createUDT_fullname

// Using a nested user-defined type via typeOf:
// START-createUDT_locDet
schema.type('location_details').
    ifNotExists().
    property('loc_address', frozen(typeOf('address'))).
    property('telephone', listOf(Text)).
    create()
// END-createUDT_locDet
Nested UDTs are acceptable in DataStax Graph, like the use of address in location_details.
The results of the UDT creation can be examined with:
schema.types().describe()
The Gremlin console result:
==>schema.type('address').ifNotExists().property('address1', Varchar).property('address2', Varchar).property('city_code', Varchar).property('state_code', Varchar).property('zip_code', Varchar).create()
schema.type('fullname').ifNotExists().property('firstname', Varchar).property('lastname', Varchar).create()
schema.type('location_details').ifNotExists().property('loc_address', frozen(typeOf('address'))).property('telephone', listOf(Varchar)).create()
-
Vertex label schema
-
Create all the vertex labels for the food graph:
// VERTEX LABELS
// ********
// SYNTAX:
// schema.vertexLabel('vertexLabel')
//    [ .ifNotExists() ]
//    .partitionBy('propertyName', propertyType) [ ... ]
//    [ .clusterBy('propertyName', propertyType) ... ]
//    [ .property('propertyName', propertyType) ]
//    [ .create() | .describe() | .addProperty('propertyName', propertyType).alter() ]

// SINGLE PARTITION KEY Vertex Labels
// macro_goal is a list of carbohydrate, protein, fat
// country is a list of tuple of country, start date, end date; replacement for a meta-property in classic graph
// Also, country demonstrates multi-property, being a list of countries and dates lived in
// country, start_date, end_date
// badge is a replacement for a meta-property in earlier versions
// level:year, such as gold:2015, expert:2019, or sous-chef:2009 (mainly expect to use for reviewers)
// NEED TO ADD NEW FEATURE DSP_18625
// .tableName('personTable')
// START-createVL_person
schema.vertexLabel('person').
    ifNotExists().
    partitionBy('person_id', Uuid).
    property('name', Text).
    property('gender', Text).
    property('nickname', setOf(Text)).
    property('cal_goal', Int).
    property('macro_goal', listOf(Int)).
    property('country', listOf(tupleOf(Text, Date, Date))).
    property('badge', mapOf(Text, Date)).
    create()
// END-createVL_person

// book_discount was a property in the old data model that had a ttl; I'm including here to use the same datasets
// Add as an added property
// property('book_discount', Text).
// START-createVL_book
schema.vertexLabel('book').
    ifNotExists().
    partitionBy('book_id', Int).
    property('name', Text).
    property('publish_year', Int).
    property('isbn', Text).
    property('category', setOf(Text)).
    create()
// END-createVL_book

// Going to create vertexLabel recipe through converting a CQL table to a VL
// Although the notebook shows creating a table for recipe with CQL, then converting,
// this is the Gremlin schema to make the recipe vertex label
// START-createVL_recipe
schema.vertexLabel('recipe').
    ifNotExists().
    partitionBy('recipe_id', Int).
    property('name', Text).
    property('cuisine', setOf(Text)).
    property('instructions', Text).
    property('notes', Text).
    create()
// END-createVL_recipe

// START-createVL_item_meal
schema.vertexLabel('meal_item').
    ifNotExists().
    partitionBy('item_id', Int).
    property('name', Text).
    property('serv_amt', Text).
    property('macro', listOf(Int)).
    property('calories', Int).
    create()
// END-createVL_item_meal

// START-createVL_ingredient
schema.vertexLabel('ingredient').
    ifNotExists().
    partitionBy('ingred_id', Int).
    property('name', Text).
    create()
// END-createVL_ingredient

// START-createVL_home
schema.vertexLabel('home').
    ifNotExists().
    partitionBy('home_id', Int).
    property('name', Text).
    create()
// END-createVL_home

// START-createVL_store
schema.vertexLabel('store').
    ifNotExists().
    partitionBy('store_id', Int).
    property('name', Text).
    create()
// END-createVL_store

// MULTIPLE-KEY VERTEX ID
// START-createVL_meal
schema.vertexLabel('meal').
    ifNotExists().
    partitionBy('type', Text).
    partitionBy('meal_id', Int).
    create()
// END-createVL_meal

// COMPOSITE KEY VERTEX ID
// START-createVL_fridge_sensor
schema.vertexLabel('fridge_sensor').
    ifNotExists().
    partitionBy('state_id', Int).
    partitionBy('city_id', Int).
    partitionBy('zipcode_id', Int).
    clusterBy('sensor_id', Int).
    property('name', Text).
    create()
// END-createVL_fridge_sensor

// GEOSPATIAL
// START-createVL_location
schema.vertexLabel('location').
    ifNotExists().
    partitionBy('loc_id', Text).
    property('name', Text).
    property('loc_details', frozen(typeOf('location_details'))).
    property('geo_point', Point).
    create()
// END-createVL_location

// STATIC COLUMN
// START-createVL_flag
schema.vertexLabel('flag').
    ifNotExists().
    partitionBy('country_id', Int).
    clusterBy('country', Text).
    property('flag', Text, Static).
    create()
// END-createVL_flag
Each property must be defined with a valid CQL data type. By default, properties have single cardinality, but can be defined with multiple cardinality using collections. Multiple cardinality allows more than one value to be assigned to a property.
In addition, properties can have their own properties using nested collections such as listOf(tupleOf); in older versions these properties were called meta-properties. Notice that vertex labels can be created with ifNotExists(), to prevent overwriting a definition that already exists.
DataStax Graph limits the number of vertex and edge labels to 200 per graph.
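To make the difference between single and multiple cardinality concrete, here is a hedged sketch of adding one person vertex whose nickname and macro_goal properties each hold several values. It assumes g is aliased to the graph and the person vertex label above has been created; the sample values are hypothetical, and passing Groovy collection literals as property values is an assumption about how DataStax Graph accepts collection-typed properties.

```groovy
// Sketch only: assumes the 'person' vertex label above exists and g is aliased.
// person_id is the partition key (Uuid); name has single cardinality (Text).
// nickname is setOf(Text), so a Set of values is assigned to one property.
// macro_goal is listOf(Int): carbohydrate, protein, fat.
// All values below are hypothetical sample data.
g.addV('person').
    property('person_id', UUID.randomUUID()).
    property('name', 'Jane DOE').
    property('nickname', ['JD', 'Janie'] as Set).
    property('macro_goal', [30, 40, 30])
```

The single-cardinality name property holds exactly one value, while the two collection-typed properties each carry more than one value in a single property.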
-
Add property to a vertex label
-
The schema for vertex labels defines the label type, at least one partition key, optional clustering columns, and properties. Additionally, properties can be added later:
// START-addVLProp
schema.vertexLabel('book').
    addProperty('book_discount', Text).
    alter()
// END-addVLProp
Note the use of alter() to change the vertex label schema.
-
Edge label schema
-
Create the food edge labels:
// ********
// EDGE LABELS
// ********
// SYNTAX:
// schema.edgeLabel('edgeLabel').
//    [ materializedView('indexName'). | secondaryIndex('indexName'). | searchIndex('indexName'). | inverse(). ]
//    [ by('propertyName'). ]
//    [ tableName('tableName'). ]
//    [ ifNotExists(). ]
//    from('vertexLabel').
//    to('vertexLabel').
//    [ partitionBy('propertyName', propertyType). [ ... ] ]
//    [ clusterBy('propertyName', propertyType). [ ... ] ]
//    [ property('propertyName', propertyType). ]
//    [ create() | describe() | drop() |
//      addProperty('propertyName', propertyType).alter() |
//      dropProperty('propertyName', propertyType).alter() ]
//    [ fromExistingTable('tableName')
//      from('vertexLabel'). [ mappingProperty('CQLPropertyName'). ]
//      to('vertexLabel'). [ mappingProperty('CQLPropertyName'). ] ]
// ********

// START-createELs_person_authored_book
schema.edgeLabel('authored').
    ifNotExists().
    from('person').to('book').
    create()
// END-createELs_person_authored_book

// START-createELs_person_ate_meal
schema.edgeLabel('ate').
    tableName('person_eating').
    ifNotExists().
    from('person').to('meal').
    property('meal_date', Date).
    create()
// END-createELs_person_ate_meal

// START-createELs_person_knows_person
schema.edgeLabel('knows').
    ifNotExists().
    from('person').to('person').
    property('since', Date).
    create()
// END-createELs_person_knows_person

// START-createELs_meal_includes_mealItem
schema.edgeLabel('includes').
    ifNotExists().
    from('meal').to('meal_item').
    property('num_serv', Int).
    create()
// END-createELs_meal_includes_mealItem

// START-createELs_recipe_includes_ingredient
schema.edgeLabel('includes').
    ifNotExists().
    from('recipe').to('ingredient').
    property('amount', Text).
    create()
// END-createELs_recipe_includes_ingredient

// START-createELs_recipe_included_in_meal
schema.edgeLabel('included_in').
    ifNotExists().
    from('recipe').to('meal').
    property('amount', Text).
    create()
// END-createELs_recipe_included_in_meal

// START-createELs_recipe_included_in_book
schema.edgeLabel('included_in').
    ifNotExists().
    from('recipe').to('book').
    create()
// END-createELs_recipe_included_in_book

// START-createELs_person_created_recipe
schema.edgeLabel('created').
    ifNotExists().
    from('person').to('recipe').
    property('create_date', Date).
    create()
// END-createELs_person_created_recipe

// START-createELs_person_reviewed_recipe
schema.edgeLabel('reviewed').
    ifNotExists().
    from('person').to('recipe').
    property('time', Time).
    property('year', Date).
    property('stars', Int).
    property('comment', Text).
    create()
// END-createELs_person_reviewed_recipe

// START-createELs_fridge_sensor_contains_ingredient
schema.edgeLabel('contains').
    ifNotExists().
    from('fridge_sensor').to('ingredient').
    property('expire_date', Date).
    create()
// END-createELs_fridge_sensor_contains_ingredient

// START-createELs_store_is_stocked_with_ingredient
schema.edgeLabel('is_stocked_with').
    ifNotExists().
    from('store').to('ingredient').
    property('expire_date', Date).
    create()
// END-createELs_store_is_stocked_with_ingredient

// START-createELs_home_is_located_at_location
schema.edgeLabel('is_located_at').
    ifNotExists().
    from('home').to('location').
    create()
// END-createELs_home_is_located_at_location

// START-createELs_store_isLocatedAt_location
schema.edgeLabel('is_located_at').
    ifNotExists().
    from('store').to('location').
    create()
// END-createELs_store_isLocatedAt_location

// START-createELs_fridge_sensor_is_located_at_home
schema.edgeLabel('is_located_at').
    ifNotExists().
    from('fridge_sensor').to('home').
    create()
// END-createELs_fridge_sensor_is_located_at_home
The schema for edge labels defines the label type, and defines the two vertex labels that are connected by the edge label with from() and to(). The reviewed edge label defines edges between adjacent vertices with the outgoing vertex label person and the incoming vertex label recipe. By default, edges have single cardinality. To specify multiple edges between two unique vertex labels, a distinguishing edge property must be included in the edge label schema.
-
Add properties to vertex labels or edge labels
-
Alter the vertex label book by adding the property book_discount:
schema.vertexLabel('book').
    addProperty('book_discount', Text).
    alter()
Properties can also be added to edge labels using the same steps.
-
Add two properties to an edge label, so that dropping a property can be demonstrated later in the QuickStart, with either:
schema.edgeLabel('authored').
    addProperty('one', Int).
    addProperty('two', Int).
    alter()
or with additional information about which particular edge label between two defined vertex labels:
schema.edgeLabel('authored').
    from('person').
    to('book').
    addProperty('one', Int).
    addProperty('two', Int).
    alter()
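One way to confirm that the alter() calls took effect is to describe the changed labels, in the same way schema.types().describe() was used for the UDTs. The sketch below relies on the describe() option listed in the syntax summaries above and assumes the earlier steps have been run; the exact text printed depends on your installation, so no output is shown.

```groovy
// Sketch only: assumes the vertex and edge label steps above have been run.
// The description of 'book' should now include the book_discount property.
schema.vertexLabel('book').
    describe()

// The description of 'authored' should now include the properties one and two.
schema.edgeLabel('authored').
    describe()
```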