Getting started - quick start with DataStax Studio

Insert data and run traversals.

Graph databases are useful for discovering simple and complex relationships between objects. Relationships are fundamental to how objects interact with one another and their environment. Graph databases are designed to represent these relationships directly.

Graph databases consist of three elements:
vertex
A vertex is an object, such as a person, location, automobile, recipe, or anything else you can think of as a noun.
edge
An edge defines the relationship between two vertices. A person can create software, or an author can write a book. Think verbs when defining edges.
property
A key-value pair that describes some attribute of either a vertex or an edge. The term property key refers to the key in the key-value pair. All properties are global in DSE Graph, meaning that a property can be used for any vertex. For example, "name" can be used for all vertices in a graph.
Vertices, edges, and properties can all have properties; for this reason, DSE Graph is classified as a property graph. Properties are central to storing and querying information in a property graph.
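
The three elements can be pictured with a few lines of plain Python. This is only an illustrative sketch and not the DSE Graph API: the class names Vertex and Edge, and their fields, are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """A noun in the graph: a label plus key-value properties."""
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A verb in the graph: connects an outgoing vertex to an incoming vertex."""
    label: str
    out_v: Vertex
    in_v: Vertex
    properties: dict = field(default_factory=dict)


# Two vertices that reuse the same property key, "name".
julia = Vertex("author", {"name": "Julia Child", "gender": "F"})
book = Vertex("book", {"name": "The Art of French Cooking, Vol. 1", "year": 1961})

# One edge that records the relationship between them.
authored = Edge("authored", julia, book)

summary = (
    f"{authored.out_v.properties['name']} "
    f"-{authored.label}-> "
    f"{authored.in_v.properties['name']}"
)
print(summary)
```

Note that "name" appears on both vertices, which mirrors the idea that a property key is global and can be reused across vertex labels.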

Property graphs are typically quite large, although the nature of querying the graph varies depending on whether the graph has large numbers of vertices, edges, or both vertices and edges. To get started with graph database concepts, a toy graph is used for simplicity. The example used here explores the world of food.

Recipe Toy Graph

Elements are labeled to distinguish the type of vertices and edges in a graph database using vertex labels and edge labels. A vertex labeled author holds information about an author. An edge between an author and a book is labeled authored. Specifying appropriate labels is an important step in graph data modeling.

Vertices and edges generally have properties. For instance, an author vertex can have properties name and gender. Edges can also have properties. A created edge can have a year property that identifies when the adjoining recipe vertex was created.

Information in a graph database is retrieved using graph traversals. A graph traversal walks a graph from a defined starting point with a single traversal step or a series of traversal steps, filtering at each step until it returns a result.

To retrieve information using graph traversals, you must first insert data. The steps listed in this section allow you to gain a rudimentary understanding of DSE Graph with a minimum amount of configuration and schema creation.

Procedure

  1. Install DataStax Enterprise.
  2. Start DataStax Enterprise with DSE Graph enabled.
  3. Install and start DataStax Studio. Also create a Studio notebook, if needed.
  4. In DataStax Studio, create a new connection. Choose a graph name; any previously unused graph name will work.

    A connection in Studio defines the graph and assigns a graph traversal g for that graph. A graph traversal is the mechanism for visiting each vertex in a graph, based on the filters defined in the graph traversal. To query DSE Graph, the graph traversal g must be assigned to a particular graph; Studio manages this assignment with connections.

  5. In DataStax Studio, create a new notebook. Select the connection created in the last step. Each notebook is connected to a particular graph. Multiple notebooks can be connected to the same graph.

    A blank notebook opens with a single cell. DSE Graph runs a Gremlin Server (tinkerpop.server) on each DataStax Enterprise node. DataStax Studio automatically connects to the Gremlin Server and, if the graph does not exist, creates it using the connection information. The graph is stored as one graph instance per Cassandra keyspace. Once a graph exists, a graph traversal g is configured that allows graph traversals to be executed to query the graph. A graph traversal is bound to a specific traversal source, which by default is the standard OLTP traversal engine. The graph commands can add vertices and edges to the database, or get other graph information. The g commands can query or add vertices and edges.

  6. First, set the schema mode to Development. Development is a more lenient mode that allows schema to be added at any time during testing. Also allow full scans for testing purposes to inspect the data with broad graph traversals. For production, Production schema mode should be set to prevent interactive schema changes that can lead to anomalous behavior, and full scans should be turned off.
    schema.config().option('graph.schema_mode').set('Development')
    schema.config().option('graph.allow_scan').set('true')
  7. To check the number of vertices that exist in the graph, use the traversal step count(). There should currently be none, because we have not added data yet. A graph traversal g is chained with V() to retrieve all vertices and count() to compute the number of vertices. Chaining executes sequential traversal steps in the most efficient order.
    g.V().count()
    CAUTION:
    Be aware that queries doing full graph scans with g.V().count() should not be run on large graphs! If multiple DSE nodes are configured, this traversal step intensively walks all partitions on all nodes in the cluster that have graph data.

Simple example

Let's start with a simple example from the recipe data model. The data is composed of two vertices, one author (Julia Child) and one book (The Art of French Cooking, Vol. 1) with an edge between them to identify that Julia Child authored that book. Without creating any schema, the three elements can be created as shown below. However, DSE Graph makes a best guess at the schema, as we'll talk about below.

  1. First, make a vertex for Julia Child. The vertex label is author and two property key-value pairs are created for name and gender. Note that a label designates the key for a key-value pair that sets the vertex label. Run the command below and look at the results using the buttons to display the Raw, Table, and Graph views.
    juliaChild = graph.addVertex(label,'author', 'name','Julia Child', 'gender','F')
    Each view displays the same information:
    • auto-generated id, consisting of a member_id, a community_id and a label

      The member_id and the community_id group vertices within the graph storage structure (see Anatomy of a Graph Traversal)

    • vertex label
    • properties, name and gender, and their values
    Note: Standard auto-generated ids are deprecated with DSE 6.0. Custom ids will undergo changes, and specifying vertex ids with partitionKey and clusteringKey will likely become the normal method.
    As illustrated in the next command, a property key can be reused for different types of information. Properties are global in the sense that they can be used with multiple vertex labels. However, it is important to understand that you must specify a vertex label in conjunction with a property in a graph traversal.

    Run the next command to create a book vertex. Be careful not to run any command twice, or you'll create a duplicate in the graph!

  2. Create a book in the graph:
    artOfFrenchCookingVolOne = graph.addVertex(label, 'book','name', 'The Art of French Cooking, Vol. 1', 'year', 1961)

    As with the author vertex, you can see all the information about the book vertex created. In Graph view, use the Settings button (the gear) to change the display label for author by entering Chef {{name}}. Change the book display label with {{label}}:{{name}}.

  3. Run the next two commands. The first command creates the edge between the author and book vertices. The second command is a graph traversal that retrieves the two vertices and the edge that connects them. Use Graph view to see the relationship. Scroll over elements to display additional information.
    juliaChild.addEdge('authored', artOfFrenchCookingVolOne)
    g.V()

    We now have data!

  4. Ensure that the data inserted for the author is correct by checking with a has() step using the vertex label author and the property name = Julia Child. This graph traversal is a basic starting point for more complex traversals, because it narrows the search of the graph with specific information.
    g.V().has('author', 'name', 'Julia Child')
    Use the Table view to look at the results, as it is much more readable than the Raw view.

    The vertex information is displayed for the author vertex for Julia Child. A vertex label specifies the type of vertex, and the key-value pairs identify the property key and its value for name and gender. The automatically generated id consists of a vertex label and two components associated with the location of the vertex within the graph. Anatomy of a Graph Traversal explains the id components.

  5. Another useful traversal step is valueMap(), which prints the key-value listing of each property for the specified vertices.
    g.V().hasLabel('author').valueMap()
    CAUTION:
    Using valueMap() without specifying properties can result in slow query latencies, if a large number of property keys exist for the queried vertex or edge. Specific properties can be specified, such as valueMap('name').
  6. If only the value for a particular property key is desired, use the values() traversal step. This example gets the name of all vertices.
    g.V().values('name')

    Only two vertices exist, so two results are returned. If multiple vertices exist, the traversal step returns results for all vertices that have a name.

  7. Edge information can also be retrieved. This command filters all edges to find those with an edge label authored. The edge information displays details about the incoming and outgoing vertices as well as edge parameters id, label, and type.
    g.E().hasLabel('authored')
    {
      "id": "{out_vertex={member_id=0, community_id=1372852736, ~label=author}, 
                local_id=ca2fad30-0e55-11e6-b5e4-0febe4822aa4, 
                in_vertex={member_id=0, community_id=14617472, ~label=book}, ~type=authored}",
      "label": "authored",
      "type": "edge",
      "inVLabel": "book",
      "outVLabel": "author",
      "inV": "book:14617472:0",
      "outV": "author:1372852736:0"
    }
  8. The traversal step count() is useful for counting both the number of vertices and edges. To count edges, substitute E() for V(). You should have one edge.
    g.E().count()
  9. Re-running the vertex count traversal done at the beginning of this tutorial should now yield two vertices.
    g.V().count()
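
The filtering steps used in this simple example can be mimicked on plain Python data to show what each one keeps or discards. This is a rough sketch under stated assumptions: the dict layout and the helper names has, has_label, value_map, and values are invented for illustration, and the output shapes are simplified compared with what Studio displays.

```python
# Each vertex is a dict with a label and properties (hypothetical layout).
vertices = [
    {"label": "author", "properties": {"name": "Julia Child", "gender": "F"}},
    {"label": "book", "properties": {"name": "The Art of French Cooking, Vol. 1", "year": 1961}},
]


def has(vs, label, key, value):
    """Keep vertices with the given label whose property key equals value."""
    return [v for v in vs if v["label"] == label and v["properties"].get(key) == value]


def has_label(vs, label):
    """Keep vertices with the given label."""
    return [v for v in vs if v["label"] == label]


def value_map(vs, *keys):
    """Return the properties of each vertex, optionally limited to the given keys."""
    return [
        {k: val for k, val in v["properties"].items() if not keys or k in keys}
        for v in vs
    ]


def values(vs, key):
    """Return only the value of one property key, skipping vertices without it."""
    return [v["properties"][key] for v in vs if key in v["properties"]]


print(has(vertices, "author", "name", "Julia Child"))
print(value_map(has_label(vertices, "author")))
print(values(vertices, "name"))
print(len(vertices))
```

Passing specific keys to value_map, as in value_map(has_label(vertices, "author"), "name"), corresponds to the advice above about limiting valueMap() to the properties you need.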

Creating schema

Before adding more data to the graph, let's stop and talk about schema. Schema defines the possible properties and their data types for the graph. These properties are then used in the definitions of vertex labels and edge labels. The last critical step in schema creation is index creation. Indexes play an important role in making graph traversals efficient and fast. See Creating graph schema using Studio and Indexing graph data for more information.

First, let's create schema for the property keys. In the next two cells, the first command clears the schema for the previously created vertices and edge. After the schema creation is completed, the next step is to enter data for those elements again in a longer script.

Note: DSE Graph has two schema modes, Production and Development. In Production mode, all schema must be identified before data is entered. In Development mode, schema can be created after data is entered.
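
The difference between the two modes can be pictured as a check performed when data is inserted. The following plain-Python sketch is an assumption-laden illustration, not how DSE Graph is implemented: the ToySchema class, its methods, and the type-guessing rule are all hypothetical.

```python
class ToySchema:
    """Toy model of schema modes; names and behavior are illustrative only."""

    def __init__(self, mode="Development"):
        self.mode = mode
        self.property_keys = {}  # property key -> Python type

    def create_property_key(self, key, py_type):
        """Define a property key and its data type ahead of time."""
        self.property_keys[key] = py_type

    def check(self, key, value):
        """Validate one property before it is stored."""
        if key not in self.property_keys:
            if self.mode == "Production":
                raise ValueError(f"property key '{key}' is not defined in the schema")
            # Development: guess the type from the first value seen.
            self.property_keys[key] = type(value)
        expected = self.property_keys[key]
        if not isinstance(value, expected):
            raise TypeError(f"'{key}' expects {expected.__name__}")


dev = ToySchema("Development")
dev.check("year", 1961)            # accepted; type guessed as int
print(dev.property_keys)

prod = ToySchema("Production")
prod.create_property_key("name", str)
prod.check("name", "Julia Child")  # accepted; key was defined first
try:
    prod.check("year", 1961)       # rejected; key was never defined
except ValueError as err:
    print(err)
```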

  1. Create the new schema:
    1. Clear the schema:
      schema.clear()
    2. Create the new property key schema:
      // Property Keys 
      // Check for previous creation of property key with ifNotExists() 
      schema.propertyKey('name').Text().ifNotExists().create()        
      schema.propertyKey('gender').Text().create()
      schema.propertyKey('instructions').Text().create()
      schema.propertyKey('category').Text().create()
      schema.propertyKey('year').Int().create()
      schema.propertyKey('timestamp').Timestamp().create()
      schema.propertyKey('ISBN').Text().create()
      schema.propertyKey('calories').Int().create()
      schema.propertyKey('amount').Text().create()
      schema.propertyKey('stars').Int().create()
      // single() is optional, as it is the default
      schema.propertyKey('comment').Text().single().create()
      // Example of a multiple property that can have several values
      // schema.propertyKey('nickname').Text().multiple().create()
      // Next 2 lines define two properties, then create a meta-property 'livedIn' on 'country'
      // A meta-property is a property of a property
      // EX: 'livedIn': '1999-2005' 'country': 'Belgium' 
      schema.propertyKey('livedIn').Text().create()                        
      schema.propertyKey('country').Text().multiple().properties('livedIn').create()

      Each property must be defined with a data type. DSE Graph data types are aligned with Cassandra data types. The data types used here are Text, Int, and Timestamp. By default, properties have single cardinality, but can be defined with multiple cardinality. Multiple cardinality allows more than one value to be assigned to a property.

      In addition, properties can have their own properties, or meta-properties. Meta-properties can only be nested one deep, and are useful for keying information to an individual property. Notice that property keys can be created with an additional method ifNotExists(). This method prevents overwriting a definition that can already exist. After property keys are created, vertex labels and edge labels can be defined.

  2. Create the schema for vertex labels and edge labels:
    // Vertex Labels
    schema.vertexLabel('author').ifNotExists().create()
    schema.vertexLabel('recipe').create()
    // Example of creating vertex label with properties
    // schema.vertexLabel('recipe').properties('name','instructions').create()
    // Example of adding properties to a previously created vertex label      
    // schema.vertexLabel('recipe').properties('name','instructions').add()         
    schema.vertexLabel('ingredient').create()
    schema.vertexLabel('book').create()
    schema.vertexLabel('meal').create()
    schema.vertexLabel('reviewer').create()
    // Example of custom vertex id:
    // schema.propertyKey('city_id').Int().create()
    // schema.propertyKey('sensor_id').Uuid().create()
    // schema.vertexLabel('FridgeSensor').partitionKey('city_id').clusteringKey('sensor_id').create()
                    
    // Edge Labels
    schema.edgeLabel('authored').ifNotExists().create()
    schema.edgeLabel('created').create()
    schema.edgeLabel('includes').create()
    schema.edgeLabel('includedIn').create()
    schema.edgeLabel('rated').connection('reviewer','recipe').create()

    The schema for vertex labels defines the label type, and optionally defines the properties associated with the vertex label. Two methods exist for associating properties with vertex labels: define them during creation, or add them after the vertex label is created. You can use ifNotExists() for any schema creation.

    DSE Graph limits the number of vertex labels to 200 per graph.

    Vertex ids are automatically generated, but custom vertex ids can be created if necessary. The custom vertex id example shown here defines a partition key and a clustering key.

    The schema for edge labels defines the label type, and optionally defines the two vertex labels that are connected by the edge label with connection(). The rated edge label defines edges between adjacent vertices with the outgoing vertex label reviewer and the incoming vertex label recipe. By default, edges have multiple cardinality, but can be defined with single cardinality. Multiple cardinality allows more than one edge with differing property values but the same edge label to be assigned.

  3. Create the index schema:
    // Vertex Indexes
    // Secondary
    schema.vertexLabel('author').index('byName').secondary().by('name').add()
    // Materialized	  		
    schema.vertexLabel('recipe').index('byRecipe').materialized().by('name').add()
    schema.vertexLabel('meal').index('byMeal').materialized().by('name').add()
    schema.vertexLabel('ingredient').index('byIngredient').materialized().by('name').add()
    schema.vertexLabel('reviewer').index('byReviewer').materialized().by('name').add()
    // Search
    // schema.vertexLabel('recipe').index('search').search().by('instructions').asText().add()
    // schema.vertexLabel('recipe').index('search').search().by('instructions').asString().add()
    // If more than one property key is search indexed
    // schema.vertexLabel('recipe').index('search').search().by('instructions').asText().by('category').asString().add()
    
    // Property index using meta-property 'livedIn': 
    schema.vertexLabel('author').index('byLocation').property('country').by('livedIn').add()
    
    // Edge Index
    schema.vertexLabel('reviewer').index('ratedByStars').outE('rated').by('stars').add()

    Indexing is a complex and highly important subject. Here, several types of indexes are created. Briefly, secondary and materialized indexes are two types of indexes that use Cassandra built-in indexing. Search indexes use DSE Search, which is Solr-based. Only one search index per vertex label is allowed, but multiple properties can be included. Property indexes allow meta-properties to be indexed. Edge indexes allow properties on edges to be indexed. Note that indexes are added with add() to previously created vertex labels.

  4. Examine the schema:
    schema.describe()

    The schema.describe() command displays schema you can use to recreate the schema entered. If you enter data without creating schema, you can use this command to verify the data types set for each property.

    Currently, in DSE Graph, schema cannot be modified once created. Additional properties, vertex labels, edge labels, and indexes can be created, but the data type of a property, for instance, cannot be changed. While entering data without schema creation is useful while developing and learning, it is strongly discouraged for actual applications. As a reminder, Production mode disallows schema creation once data is loaded.

  5. To find only the schema for a particular type of item in the describe() listing, use the following command:
    schema.describe().split('\n').grep(~/.*index.*/)

    Additional steps can split the output per newline and grep for a string as shown for index. The Gremlin variant used here is based on Apache Groovy, so any Groovy commands can be used to manipulate graph traversals. Apache Groovy is a language that smoothly integrates with Java to provide scripting capabilities.
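
To make the role of the indexes created in step 3 more concrete, the following plain-Python sketch contrasts a lookup that scans every vertex with one that consults a prebuilt map from property value to vertices. It only illustrates the general idea of an index; it does not reflect how Cassandra or DSE Search implement theirs, and the function names and dict layout are assumptions.

```python
from collections import defaultdict

# A few author vertices as plain dicts (hypothetical layout).
authors = [
    {"label": "author", "name": "Julia Child"},
    {"label": "author", "name": "Simone Beck"},
    {"label": "author", "name": "Alice Waters"},
]


def scan_by_name(vertices, name):
    """Without an index: inspect every vertex (a full scan)."""
    return [v for v in vertices if v["name"] == name]


def build_index(vertices, key):
    """With an index: map each property value to its vertices once, up front."""
    index = defaultdict(list)
    for v in vertices:
        index[v[key]].append(v)
    return index


by_name = build_index(authors, "name")

# Both approaches return the same vertex; the index avoids visiting the others.
print(scan_by_name(authors, "Julia Child"))
print(by_name["Julia Child"])
```

The trade-off in this toy version is that the index must be built and kept up to date as vertices are added, in exchange for lookups that do not grow with the number of vertices.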

Adding more data

  1. Now that schema is created, add more vertices and edges using the following script to explore more connections in the recipe data model. Enter the following lines in a single DataStax Studio cell and execute. Notice that the first command, g.V().drop().iterate(), drops all vertex and edge data before reading in new data. Be sure to select the Graph view after running the script.
    // generateRecipe.groovy
    
    // Add all vertices and edges for Recipe
    g.V().drop().iterate()
    
    // author vertices
    juliaChild = graph.addVertex(label, 'author', 'name','Julia Child', 'gender', 'F')
    simoneBeck = graph.addVertex(label, 'author', 'name', 'Simone Beck', 'gender', 'F')
    louisetteBertholie = graph.addVertex(label, 'author', 'name', 'Louisette Bertholie', 'gender', 'F')
    patriciaSimon = graph.addVertex(label, 'author', 'name', 'Patricia Simon', 'gender', 'F')
    aliceWaters = graph.addVertex(label, 'author', 'name', 'Alice Waters', 'gender', 'F')
    patriciaCurtan = graph.addVertex(label, 'author', 'name', 'Patricia Curtan', 'gender', 'F')
    kelsieKerr = graph.addVertex(label, 'author', 'name', 'Kelsie Kerr', 'gender', 'F')
    fritzStreiff = graph.addVertex(label, 'author', 'name', 'Fritz Streiff', 'gender', 'M')
    emerilLagasse = graph.addVertex(label, 'author', 'name', 'Emeril Lagasse', 'gender', 'M')
    jamesBeard = graph.addVertex(label, 'author', 'name', 'James Beard', 'gender', 'M')
    
    // book vertices
    artOfFrenchCookingVolOne = graph.addVertex(label, 'book', 'name', 'The Art of French Cooking, Vol. 1', 'year', 1961)
    simcasCuisine = graph.addVertex(label, 'book', 'name', "Simca's Cuisine: 100 Classic French Recipes for Every Occasion", 'year', 1972, 'ISBN', '0-394-40152-2')
    frenchChefCookbook = graph.addVertex(label, 'book', 'name','The French Chef Cookbook', 'year', 1968, 'ISBN', '0-394-40135-2')
    artOfSimpleFood = graph.addVertex(label, 'book', 'name', 'The Art of Simple Food: Notes, Lessons, and Recipes from a Delicious Revolution', 'year', 2007, 'ISBN', '0-307-33679-4')
    
    // recipe vertices
    beefBourguignon = graph.addVertex(label, 'recipe', 'name', 'Beef Bourguignon', 'instructions', 'Braise the beef. Saute the onions and carrots. Add wine and cook in a dutch oven at 425 degrees for 1 hour.')
    ratatouille = graph.addVertex(label, 'recipe', 'name', 'Ratatouille', 'instructions', 'Peel and cut the eggplant. Make sure you cut eggplant into lengthwise slices that are about 1-inch wide, 3-inches long, and 3/8-inch thick')
    saladeNicoise = graph.addVertex(label, 'recipe', 'name', 'Salade Nicoise', 'instructions', 'Take a salad bowl or platter and line it with lettuce leaves, shortly before serving. Drizzle some olive oil on the leaves and dust them with salt.')
    wildMushroomStroganoff = graph.addVertex(label, 'recipe', 'name', 'Wild Mushroom Stroganoff', 'instructions', 'Cook the egg noodles according to the package directions and keep warm. Heat 1 1/2 tablespoons of the olive oil in a large saute pan over medium-high heat.')
    spicyMeatloaf = graph.addVertex(label, 'recipe', 'name', 'Spicy Meatloaf', 'instructions', 'Preheat the oven to 375 degrees F. Cook bacon in a large skillet over medium heat until very crisp and fat has rendered, 8-10 minutes.')
    oystersRockefeller = graph.addVertex(label, 'recipe', 'name', 'Oysters Rockefeller', 'instructions', 'Saute the shallots, celery, herbs, and seasonings in 3 tablespoons of the butter for 3 minutes. Add the watercress and let it wilt.')
    carrotSoup = graph.addVertex(label, 'recipe', 'name', 'Carrot Soup', 'instructions', 'In a heavy-bottomed pot, melt the butter. When it starts to foam, add the onions and thyme and cook over medium-low heat until tender, about 10 minutes.')
    roastPorkLoin = graph.addVertex(label, 'recipe', 'name', 'Roast Pork Loin', 'instructions', 'The day before, separate the meat from the ribs, stopping about 1 inch before the end of the bones. Season the pork liberally inside and out with salt and pepper and refrigerate overnight.')
    
    // ingredients vertices
    beef = graph.addVertex(label, 'ingredient', 'name', 'beef')
    onion = graph.addVertex(label, 'ingredient', 'name', 'onion')
    mashedGarlic = graph.addVertex(label, 'ingredient', 'name', 'mashed garlic')
    butter = graph.addVertex(label, 'ingredient', 'name', 'butter')
    tomatoPaste = graph.addVertex(label, 'ingredient', 'name', 'tomato paste')
    eggplant = graph.addVertex(label, 'ingredient', 'name', 'eggplant')
    zucchini = graph.addVertex(label, 'ingredient', 'name', 'zucchini')
    oliveOil = graph.addVertex(label, 'ingredient', 'name', 'olive oil')
    yellowOnion = graph.addVertex(label, 'ingredient', 'name', 'yellow onion')
    greenBean = graph.addVertex(label, 'ingredient', 'name', 'green beans')
    tuna = graph.addVertex(label, 'ingredient', 'name', 'tuna')
    tomato = graph.addVertex(label, 'ingredient', 'name', 'tomato')
    hardBoiledEgg = graph.addVertex(label, 'ingredient', 'name', 'hard-boiled egg')
    eggNoodles = graph.addVertex(label, 'ingredient', 'name', 'egg noodles')
    mushroom = graph.addVertex(label, 'ingredient', 'name', 'mushrooms')
    bacon = graph.addVertex(label, 'ingredient', 'name', 'bacon')
    celery = graph.addVertex(label, 'ingredient', 'name', 'celery')
    greenBellPepper = graph.addVertex(label, 'ingredient', 'name', 'green bell pepper')
    groundBeef = graph.addVertex(label, 'ingredient', 'name', 'ground beef')
    porkSausage = graph.addVertex(label, 'ingredient', 'name', 'pork sausage')
    shallot = graph.addVertex(label, 'ingredient', 'name', 'shallots')
    chervil = graph.addVertex(label, 'ingredient', 'name', 'chervil')
    fennel = graph.addVertex(label, 'ingredient', 'name', 'fennel')
    parsley = graph.addVertex(label, 'ingredient', 'name', 'parsley')
    oyster = graph.addVertex(label, 'ingredient', 'name', 'oyster')
    pernod = graph.addVertex(label, 'ingredient', 'name', 'Pernod')
    thyme = graph.addVertex(label, 'ingredient', 'name', 'thyme')
    carrot = graph.addVertex(label, 'ingredient', 'name', 'carrots')
    chickenBroth = graph.addVertex(label, 'ingredient', 'name', 'chicken broth')
    porkLoin = graph.addVertex(label, 'ingredient', 'name', 'pork loin')
    redWine = graph.addVertex(label, 'ingredient', 'name', 'red wine')
    
    // meal vertices
    // timestamp can also be entered as '2015-01-01' without Instant.parse()
    SaturdayFeast = graph.addVertex(label, 'meal', 'name', 'Saturday Feast', 'timestamp', '2015-11-30', 'calories', 1000)
    EverydayDinner = graph.addVertex(label, 'meal', 'name', 'EverydayDinner', 'timestamp', '2016-01-14', 'calories', 600)
    JuliaDinner = graph.addVertex(label, 'meal', 'name', 'JuliaDinner', 'timestamp', '2016-01-14', 'calories', 900)
    
    // author-book edges
    juliaChild.addEdge('authored', artOfFrenchCookingVolOne)
    simoneBeck.addEdge('authored', artOfFrenchCookingVolOne)
    louisetteBertholie.addEdge('authored', artOfFrenchCookingVolOne)
    simoneBeck.addEdge('authored', simcasCuisine)
    patriciaSimon.addEdge('authored', simcasCuisine)
    juliaChild.addEdge('authored', frenchChefCookbook)
    aliceWaters.addEdge('authored', artOfSimpleFood)
    patriciaCurtan.addEdge('authored', artOfSimpleFood)
    kelsieKerr.addEdge('authored', artOfSimpleFood)
    fritzStreiff.addEdge('authored', artOfSimpleFood)
    
    // author - recipe edges
    juliaChild.addEdge('created', beefBourguignon, 'year', 1961)
    juliaChild.addEdge('created', ratatouille, 'year', 1965)
    juliaChild.addEdge('created', saladeNicoise, 'year', 1962)
    emerilLagasse.addEdge('created', wildMushroomStroganoff, 'year', 2003)
    emerilLagasse.addEdge('created', spicyMeatloaf, 'year', 2000)
    aliceWaters.addEdge('created', carrotSoup, 'year', 1995)
    aliceWaters.addEdge('created', roastPorkLoin, 'year', 1996)
    jamesBeard.addEdge('created', oystersRockefeller, 'year', 1970)
    
    // recipe - ingredient edges
    beefBourguignon.addEdge('includes', beef, 'amount', '2 lbs')
    beefBourguignon.addEdge('includes', onion, 'amount', '1 sliced')
    beefBourguignon.addEdge('includes', mashedGarlic, 'amount', '2 cloves')
    beefBourguignon.addEdge('includes', butter, 'amount', '3.5 Tbsp')
    beefBourguignon.addEdge('includes', tomatoPaste, 'amount', '1 Tbsp')
    ratatouille.addEdge('includes', eggplant, 'amount', '1 lb')
    ratatouille.addEdge('includes', zucchini, 'amount', '1 lb')
    ratatouille.addEdge('includes', mashedGarlic, 'amount', '2 cloves')
    ratatouille.addEdge('includes', oliveOil, 'amount', '4-6 Tbsp')
    ratatouille.addEdge('includes', yellowOnion, 'amount', '1 1/2 cups or 1/2 lb thinly sliced')
    saladeNicoise.addEdge('includes', oliveOil, 'amount', '2-3 Tbsp')
    saladeNicoise.addEdge('includes', greenBean, 'amount', '1 1/2 lbs blanched, trimmed')
    saladeNicoise.addEdge('includes', tuna, 'amount', '8-10 ozs oil-packed, drained and flaked')
    saladeNicoise.addEdge('includes', tomato, 'amount', '3 or 4 red, peeled, quartered, cored, and seasoned')
    saladeNicoise.addEdge('includes', hardBoiledEgg, 'amount', '8 halved lengthwise')
    wildMushroomStroganoff.addEdge('includes', eggNoodles, 'amount', '16 ozs wide')
    wildMushroomStroganoff.addEdge('includes', mushroom, 'amount', '2 lbs wild or exotic, cleaned, stemmed, and sliced')
    wildMushroomStroganoff.addEdge('includes', yellowOnion, 'amount', '1 cup thinly sliced')
    spicyMeatloaf.addEdge('includes', bacon, 'amount', '3 ozs diced')
    spicyMeatloaf.addEdge('includes', onion, 'amount', '2 cups finely chopped')
    spicyMeatloaf.addEdge('includes', celery, 'amount', '2 cups finely chopped')
    spicyMeatloaf.addEdge('includes', greenBellPepper, 'amount', '1/4 cup finely chopped')
    spicyMeatloaf.addEdge('includes', porkSausage, 'amount', '3/4 lbs hot')
    spicyMeatloaf.addEdge('includes', groundBeef, 'amount', '1 1/2 lbs chuck')
    oystersRockefeller.addEdge('includes', shallot, 'amount', '1/4 cup chopped')
    oystersRockefeller.addEdge('includes', celery, 'amount', '1/4 cup chopped')
    oystersRockefeller.addEdge('includes', chervil, 'amount', '1 tsp')
    oystersRockefeller.addEdge('includes', fennel, 'amount', '1/3 cup chopped')
    oystersRockefeller.addEdge('includes', parsley, 'amount', '1/3 cup chopped')
    oystersRockefeller.addEdge('includes', oyster, 'amount', '2 dozen on the half shell')
    oystersRockefeller.addEdge('includes', pernod, 'amount', '1/3 cup')
    carrotSoup.addEdge('includes', butter, 'amount', '4 Tbsp')
    carrotSoup.addEdge('includes', onion, 'amount', '2 medium sliced')
    carrotSoup.addEdge('includes', thyme, 'amount', '1 sprig')
    carrotSoup.addEdge('includes', carrot, 'amount', '2 1/2 lbs, peeled and sliced')
    carrotSoup.addEdge('includes', chickenBroth, 'amount', '6 cups')
    roastPorkLoin.addEdge('includes', porkLoin, 'amount', '1 bone-in, 4-rib')
    roastPorkLoin.addEdge('includes', redWine, 'amount', '1/2 cup')
    roastPorkLoin.addEdge('includes', chickenBroth, 'amount', '1 cup')
    
    // book - recipe edges
    beefBourguignon.addEdge('includedIn', artOfFrenchCookingVolOne)
    saladeNicoise.addEdge('includedIn', artOfFrenchCookingVolOne)
    carrotSoup.addEdge('includedIn', artOfSimpleFood)
    
    // meal - recipe edges
    beefBourguignon.addEdge('includedIn', SaturdayFeast)
    carrotSoup.addEdge('includedIn', SaturdayFeast)
    oystersRockefeller.addEdge('includedIn', SaturdayFeast)
    carrotSoup.addEdge('includedIn', EverydayDinner)
    roastPorkLoin.addEdge('includedIn', EverydayDinner)
    beefBourguignon.addEdge('includedIn', JuliaDinner)
    saladeNicoise.addEdge('includedIn', JuliaDinner)
    
    // meal - book edges
    EverydayDinner.addEdge('includedIn', artOfSimpleFood)
    SaturdayFeast.addEdge('includedIn', simcasCuisine)
    JuliaDinner.addEdge('includedIn', artOfFrenchCookingVolOne)
    g.V()
    Data for the Recipe Toy Graph

    The property timestamp is a Timestamp data type that corresponds to a valid Cassandra timestamp data type.

    The g.V() command at the end of the script displays all the vertices created.

  2. Run the vertex count again. The graph now contains 56 vertices:
    g.V().count()

    The DSE Graph Loader is available for scripting data loading. It is the recommended method for data loading.

Exploring the graph with graph traversals can lead to interesting conclusions.

  1. With several author vertices in the graph, provide a specific name to find a particular vertex. This traversal gets the stored vertex information for the vertex that has the name Julia Child. Note that the traversal is also constrained by the author vertex label in the has() step.
    g.V().has('author','name','Julia Child')
  2. In this next traversal, has() gets the vertex information filtered by name = Julia Child. The traversal step outE() discovers the outgoing edges from that vertex with the authored label.
    g.V().has('name','Julia Child').outE('authored')

    In DataStax Studio, examine the results either as the Raw view listing of the edge information or as the Graph view visualization, where scrolling over a vertex provides additional information.

  3. If instead, you want to query for the books that all authors have written, the query must be modified. The previous example retrieved edges, but not the adjacent book vertices. Add a traversal step inV() to find all the vertices that connect to the outgoing edges, then print the book titles of those vertices. Notice how the chained traversal steps go from the vertices along outgoing edges to the adjacent vertices with V().outE().inV(). The outgoing edges are given a particular filter value, authored.
    g.V().outE('authored').inV().values('name')
  4. Notice that the book titles are duplicated in the resulting list, because a listing is returned for each author. If a book has three authors, three listings are returned. The traversal step dedup() can eliminate the duplication.
    g.V().outE('authored').inV().values('name').dedup()
  5. Refine the traversal by reinserting the has() step for a particular author. Find all the books authored by Julia Child.
    g.V().has('name','Julia Child').outE('authored').inV().values('name')
  6. The previous example and this example accomplish the same result. However, the number of traversal steps and the type of traversal steps can affect performance. The traversal step outE() should be only used if the edges are explicitly required. In this example, the edges are traversed to get information about connected vertices, but the edge information is not important to the query.
    g.V().has('name','Julia Child').out('authored').values('name')

    The traversal step out() retrieves the connected book vertices based on the edge label authored without retrieving the edge information. In a larger graph traversal, this subtle difference in the traversal can become a latency issue.

  7. Additional traversal steps continue to fine-tune the results. Adding another chained has() traversal step finds only books authored by Julia Child that were published after 1967. This example also shows the use of the gt (greater than) function.
    g.V().has('name','Julia Child').out('authored').has('year', gt(1967)).values('name')
  8. When developing or testing, often checking the number of vertices with each vertex label can confirm that data was read. To find the number of vertices by vertex label, use the traversal step label() followed by the traversal step groupCount(). The step groupCount() is useful for aggregating results from a previous step.
    g.V().label().groupCount()
  9. Write your data to an output file to save or exchange information. A Gryo file is a binary format file that can reload data to DSE Graph. In this next command, graph I/O writes the entire graph to a file. Other file formats can be written by substituting gryo() with graphml() or graphson().
    graph.io(gryo()).writeGraph("/tmp/recipe.gryo")
    Note: graph.io() is disabled in sandbox mode.
  10. To load a Gryo file, use the graphloader, after creating a mapping script:
    graphloader mappingGRYO.groovy -graph recipe -address localhost
    Details about loading Gryo data are found in Loading Gryo Data, in Using DSE Graph Loader.
  11. If you wish to use the Gremlin console after working in Studio, two commands are useful: system.graphs() to get a list of all graphs, and :remote config alias g some_graph.g to switch to a different graph.
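
The traversal steps used in this section can be mimicked on a small slice of the recipe data to show why duplicates appear and what dedup(), the gt filter, and groupCount() do to the results. This is a plain-Python sketch, not Gremlin: the data layout and the helper names out, names, and dedup are assumptions made for illustration.

```python
from collections import Counter

# Vertices keyed by a short id: (label, properties). Hypothetical layout.
vertices = {
    "julia": ("author", {"name": "Julia Child"}),
    "simone": ("author", {"name": "Simone Beck"}),
    "vol1": ("book", {"name": "The Art of French Cooking, Vol. 1", "year": 1961}),
    "chef": ("book", {"name": "The French Chef Cookbook", "year": 1968}),
}
# Edges as (out vertex, edge label, in vertex).
edges = [
    ("julia", "authored", "vol1"),
    ("simone", "authored", "vol1"),
    ("julia", "authored", "chef"),
]


def out(ids, label):
    """Like out(label): follow outgoing edges and return adjacent vertex ids."""
    return [dst for src in ids for (s, lbl, dst) in edges if s == src and lbl == label]


def names(ids):
    """Like values('name'): return the name property of each vertex id."""
    return [vertices[i][1]["name"] for i in ids]


def dedup(items):
    """Like dedup(): drop repeats while keeping first-seen order."""
    return list(dict.fromkeys(items))


all_ids = list(vertices)
books = names(out(all_ids, "authored"))  # one listing per author-book edge
unique_books = dedup(books)

# Books by Julia Child published after 1967.
julia_ids = [i for i, (_, props) in vertices.items() if props["name"] == "Julia Child"]
after_1967 = [i for i in out(julia_ids, "authored") if vertices[i][1]["year"] > 1967]

# Count vertices per label, as label().groupCount() does.
label_counts = Counter(label for label, _ in vertices.values())

print(books)
print(unique_books)
print(names(after_1967))
print(dict(label_counts))
```

Because the first book has two authors in this slice, it is listed twice in books and once in unique_books, which is the duplication described in step 4.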

What's next

Congratulations! You are well on your way to using DSE Graph for data discovery.

Further adventures in traversing can be found in Creating queries using traversals. If you want to explore various loading options, check out the DSE Graph Loader or Using DSE Graph.