Discovering properties about graphs and traversals

After schema and data are inserted into a graph, it is important to verify that the information is correct. Checking simple information about inserted data is a good way to get started with traversals. Use schema.describe() to check how the graph stores data and to retrieve the schema statements needed to re-create the schema, if necessary.

Procedure

  • Use the graph traversal instance g to check if data is loaded by checking the count of vertices:

    g.V().count()
    ==>100
  • Use the graph traversal instance g to check if data is loaded by checking the count of edges:

    g.E().count()
    ==>121
  • Check the properties of one vertex. Find all the information for the vertex with a name value of Julia CHILD:

    g.V().hasLabel('person').has('name', 'Julia CHILD').elementMap()
    ==>{id=dseg:/person/e7cd5752-bc0d-4157-a80f-7523add8dbcd, 
    label=person, country=[('USA','1912-08-12','1944-01-01'), 
    ('Ceylon','1944-01-01','1945-06-01'), ('France','1948-01-01','1950-01-01'), 
    ('USA','1960-01-01','2004-08-13')], gender=F, name=Julia CHILD, 
    nickname=[Jay, Julia], person_id=e7cd5752-bc0d-4157-a80f-7523add8dbcd}

    This query relies on an index for name because person_id is the partition key. If an index is not created, then only the partition key can be used to discover information about any element.

  • Check the properties of one edge. Find all the information for the edge with a label of reviewed from a particular person to a particular recipe:

    g.E('dseg:/person-reviewed-recipe/6c09f656-5aef-46df-97f9-e7f984c9a3d9/2001').elementMap()
    ==>{id=dseg:/person-reviewed-recipe/6c09f656-5aef-46df-97f9-e7f984c9a3d9/2001, 
    label=reviewed, IN={id=dseg:/recipe/2001, label=recipe, recipe_id=2001}, 
    OUT={id=dseg:/person/6c09f656-5aef-46df-97f9-e7f984c9a3d9, 
    label=person, person_id=6c09f656-5aef-46df-97f9-e7f984c9a3d9}, year=2014-02-01, 
    comment=Yummy!, stars=5, time=12:00}
  • Find the id information for vertices:

    g.V().hasLabel('fridge_sensor').id()
    ==>dseg:/fridge_sensor/45/300/66665/1
    ==>dseg:/fridge_sensor/45/300/66665/2
    ==>dseg:/fridge_sensor/45/300/66665/3
    ==>dseg:/fridge_sensor/31/100/55555/1
    ==>dseg:/fridge_sensor/31/100/55555/2
    ==>dseg:/fridge_sensor/31/100/55555/3
    ==>dseg:/fridge_sensor/31/200/55556/1
    ==>dseg:/fridge_sensor/31/200/55556/2
    ==>dseg:/fridge_sensor/31/200/55556/3

    Edge id information can be found in a similar fashion.
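
    As a sketch of that, assuming the reviewed edges shown in the earlier step are loaded, the same id() step can be applied to an edge traversal:

    g.E().hasLabel('reviewed').id()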

  • Discover schema information using schema.describe(). This call returns a sorted list of the same information that the alternative in the next step provides:

    schema.describe()
    ==>schema.type('address').ifNotExists().property('address1', Varchar).property('address2', Varchar).property('city_code', Varchar).property('state_code', Varchar).property('zip_code', Varchar).create()
    schema.type('fullname').ifNotExists().property('firstname', Varchar).property('lastname', Varchar).create()
    schema.type('location_details').ifNotExists().property('loc_address', frozen(typeOf('address'))).property('telephone', listOf(Varchar)).create()
    schema.vertexLabel('book').ifNotExists().partitionBy('book_id', Int).property('book_discount', Varchar).property('isbn', Varchar).property('name', Varchar).property('publish_year', Int).property('category', setOf(Varchar)).create()
    schema.vertexLabel('flag').ifNotExists().partitionBy('country_id', Int).clusterBy('country', Varchar, Asc).property('flag', Varchar, Static).create()
    schema.vertexLabel('fridge_sensor').ifNotExists().partitionBy('state_id', Int).partitionBy('city_id', Int).partitionBy('zipcode_id', Int).clusterBy('sensor_id', Int, Asc).property('name', Varchar).create()
    schema.vertexLabel('home').ifNotExists().partitionBy('home_id', Int).property('name', Varchar).create()
    schema.vertexLabel('ingredient').ifNotExists().partitionBy('ingred_id', Int).property('name', Varchar).create()
    schema.vertexLabel('location').ifNotExists().partitionBy('loc_id', Varchar).property('geo_point', Point).property('loc_details', frozen(typeOf('location_details'))).property('name', Varchar).create()
    schema.vertexLabel('meal').ifNotExists().partitionBy('type', Varchar).partitionBy('meal_id', Int).create()
    schema.vertexLabel('meal_item').ifNotExists().partitionBy('item_id', Int).property('calories', Int).property('name', Varchar).property('serv_amt', Varchar).property('macro', listOf(Int)).create()
    schema.vertexLabel('person').ifNotExists().partitionBy('person_id', Uuid).property('cal_goal', Int).property('gender', Varchar).property('name', Varchar).property('badge', mapOf(Varchar, Date)).property('country', listOf(tupleOf(Varchar, Date, Date))).property('macro_goal', listOf(Int)).property('nickname', setOf(Varchar)).create()
    schema.vertexLabel('recipe').ifNotExists().partitionBy('recipe_id', Int).property('instructions', Varchar).property('name', Varchar).property('notes', Varchar).property('cuisine', setOf(Varchar)).create()
    schema.vertexLabel('store').ifNotExists().partitionBy('store_id', Int).property('name', Varchar).create()
    schema.edgeLabel('contains').ifNotExists().from('fridge_sensor').to('ingredient').partitionBy(OUT, 'state_id', 'fridge_sensor_state_id').partitionBy(OUT, 'city_id', 'fridge_sensor_city_id').partitionBy(OUT, 'zipcode_id', 'fridge_sensor_zipcode_id').clusterBy(OUT, 'sensor_id', 'fridge_sensor_sensor_id', Asc).clusterBy(IN, 'ingred_id', 'ingredient_ingred_id', Asc).property('expire_date', Date).create()
    schema.edgeLabel('is_located_at').ifNotExists().from('fridge_sensor').to('location').partitionBy(OUT, 'state_id', 'fridge_sensor_state_id').partitionBy(OUT, 'city_id', 'fridge_sensor_city_id').partitionBy(OUT, 'zipcode_id', 'fridge_sensor_zipcode_id').clusterBy(OUT, 'sensor_id', 'fridge_sensor_sensor_id', Asc).clusterBy(IN, 'loc_id', 'location_loc_id', Asc).create()
    schema.edgeLabel('is_located_at').ifNotExists().from('home').to('location').partitionBy(OUT, 'home_id', 'home_home_id').clusterBy(IN, 'loc_id', 'location_loc_id', Asc).create()
    schema.edgeLabel('includes').ifNotExists().from('ingredient').to('recipe').partitionBy(OUT, 'ingred_id', 'ingredient_ingred_id').clusterBy(IN, 'recipe_id', 'recipe_recipe_id', Asc).property('amount', Varchar).create()
    schema.edgeLabel('includes').ifNotExists().from('meal').to('meal_item').partitionBy(OUT, 'type', 'meal_type').partitionBy(OUT, 'meal_id', 'meal_meal_id').clusterBy(IN, 'item_id', 'meal_item_item_id', Asc).property('num_serv', Int).create()
    schema.edgeLabel('ate').ifNotExists().from('person').to('meal').partitionBy(OUT, 'person_id', 'person_person_id').clusterBy(IN, 'type', 'meal_type', Asc).clusterBy(IN, 'meal_id', 'meal_meal_id', Asc).property('meal_date', Date).create()
    schema.edgeLabel('authored').ifNotExists().from('person').to('book').partitionBy(OUT, 'person_id', 'person_person_id').clusterBy(IN, 'book_id', 'book_book_id', Asc).create()
    schema.edgeLabel('created').ifNotExists().from('person').to('recipe').partitionBy(OUT, 'person_id', 'person_person_id').clusterBy(IN, 'recipe_id', 'recipe_recipe_id', Asc).property('create_date', Date).create()
    schema.edgeLabel('knows').ifNotExists().from('person').to('person').partitionBy(OUT, 'person_id', 'out_person_id').clusterBy(IN, 'person_id', 'in_person_id', Asc).property('since', Date).create()
    schema.edgeLabel('reviewed').ifNotExists().from('person').to('recipe').partitionBy(OUT, 'person_id', 'person_person_id').clusterBy(IN, 'recipe_id', 'recipe_recipe_id', Asc).property('comment', Varchar).property('stars', Int).property('time', Time).property('year', Date).create()
    schema.edgeLabel('included_in').ifNotExists().from('recipe').to('book').partitionBy(OUT, 'recipe_id', 'recipe_recipe_id').clusterBy(IN, 'book_id', 'book_book_id', Asc).create()
    schema.edgeLabel('included_in').ifNotExists().from('recipe').to('meal').partitionBy(OUT, 'recipe_id', 'recipe_recipe_id').clusterBy(IN, 'type', 'meal_type', Asc).clusterBy(IN, 'meal_id', 'meal_meal_id', Asc).property('amount', Varchar).create()
    schema.edgeLabel('is_located_at').ifNotExists().from('store').to('location').partitionBy(OUT, 'store_id', 'store_store_id').clusterBy(IN, 'loc_id', 'location_loc_id', Asc).create()
    schema.edgeLabel('is_stocked_with').ifNotExists().from('store').to('ingredient').partitionBy(OUT, 'store_id', 'store_store_id').clusterBy(IN, 'ingred_id', 'ingredient_ingred_id', Asc).property('expire_date', Date).create()
    schema.vertexLabel('book').searchIndex().ifNotExists().by('book_id').by('name').asString().by('publish_year').create()
    schema.vertexLabel('ingredient').materializedView('ingredient_by_name').ifNotExists().partitionBy('name').clusterBy('ingred_id', Asc).create()
    schema.vertexLabel('location').materializedView('location_by_name').ifNotExists().partitionBy('name').clusterBy('loc_id', Asc).create()
    schema.vertexLabel('location').searchIndex().ifNotExists().by('loc_id').asString().by('geo_point').create()
    schema.vertexLabel('meal').materializedView('meal_by_type').ifNotExists().partitionBy('type').clusterBy('meal_id', Asc).create()
    schema.vertexLabel('meal_item').materializedView('meal_item_by_name').ifNotExists().partitionBy('name').clusterBy('item_id', Asc).create()
    schema.vertexLabel('person').materializedView('person_by_name').ifNotExists().partitionBy('name').clusterBy('person_id', Asc).create()
    schema.vertexLabel('person').secondaryIndex('person_2i_by_badge').ifNotExists().by('badge').indexKeys().create()
    schema.vertexLabel('person').secondaryIndex('person_2i_by_country').ifNotExists().by('country').indexValues().create()
    schema.vertexLabel('person').secondaryIndex('person_2i_by_nickname').ifNotExists().by('nickname').indexValues().create()
    schema.vertexLabel('person').searchIndex().ifNotExists().by('person_id').by('country').create()
    schema.vertexLabel('recipe').materializedView('recipe_by_name').ifNotExists().partitionBy('name').clusterBy('recipe_id', Asc).create()
    schema.vertexLabel('recipe').secondaryIndex('recipe_2i_by_cuisine').ifNotExists().by('cuisine').indexValues().create()
    schema.vertexLabel('recipe').searchIndex().ifNotExists().by('recipe_id').by('instructions').create()
    schema.edgeLabel('reviewed').from('person').to('recipe').materializedView('person__reviewed__recipe_by_person_person_id_stars').ifNotExists().partitionBy(OUT, 'person_id').partitionBy('stars').clusterBy(IN, 'recipe_id', Asc).create()
    schema.edgeLabel('reviewed').from('person').to('recipe').materializedView('person__reviewed__recipe_by_person_person_id_year').ifNotExists().partitionBy(OUT, 'person_id').clusterBy('year', Asc).clusterBy(IN, 'recipe_id', Asc).create()
  • An alternative way to discover schema information uses an elementMap() step on the schema traversal:

    schema.traversal().V().elementMap()

    Partial results:

    ==>{id=0, label=schema, engine=Core}
    ==>{id=512, label=propertyKey, dataType=Time, name=time, type=Regular}
    ==>{id=2, label=vertexLabel, name=meal_item}
    ==>{id=258, label=propertyKey, name=badge}
    ==>{id=4, label=propertyKey, dataType=Int, name=item_id, type=PartitionKey}
    ==>{id=261, label=propertyKey, name=cal_goal}
    ==>{id=517, label=propertyKey, dataType=Date, name=year, type=Regular}
    ==>{id=264, label=propertyKey, name=country}
    ==>{id=9, label=propertyKey, dataType=Int, name=calories, type=Regular}
    ==>{id=522, label=edgeIndex, name=person__reviewed__recipe_by_comment, type=MaterializedView}
    ==>{id=267, label=propertyKey, name=gender}
    ==>{id=525, label=propertyKey, name=comment, type=PartitionKey}
    ==>{id=14, label=propertyKey, dataType=Varchar, name=name, type=Regular}
    ==>{id=270, label=propertyKey, name=macro_goal}
    ==>{id=273, label=propertyKey, name=name}
    ==>{id=529, label=propertyKey, name=person_person_id, type=Clustering, order=Asc}
    ==>{id=19, label=propertyKey, dataType=Varchar, name=serv_amt, type=Regular}
    ==>{id=277, label=vertexLabel, name=location}
    ==>{id=534, label=propertyKey, name=recipe_recipe_id, type=Clustering, order=Asc}
    ==>{id=279, label=propertyKey, dataType=Varchar, name=loc_id, type=PartitionKey}
    ==>{id=24, label=propertyKey, dataType=listOf(Int), name=macro, type=Regular}
    ==>{id=539, label=propertyKey, name=stars, type=Regular}
    ==>{id=284, label=propertyKey, dataType=Point, name=geo_point, type=Regular}
    ==>{id=29, label=vertexLabel, name=ingredient}
    ==>{id=31, label=propertyKey, dataType=Int, name=ingred_id, type=PartitionKey}
    ==>{id=543, label=propertyKey, name=time, type=Regular}
    ==>{id=289, label=propertyKey, dataType=frozen(typeOf('location_details')), name=loc_details, type=Regular}
    ==>{id=547, label=propertyKey, name=year, type=Regular}
    ==>{id=36, label=propertyKey, dataType=Varchar, name=name, type=Regular}
    ==>{id=294, label=propertyKey, dataType=Varchar, name=name, type=Regular}
    ==>{id=552, label=incident}
    ==>{id=41, label=vertexLabel, name=home}
    ==>{id=43, label=propertyKey, dataType=Int, name=home_id, type=PartitionKey}
    ==>{id=299, label=vertexIndex, name=food_location_solr_query_index, type=Search}
    ==>{id=556, label=edgeLabel, name=included_in}
    ==>{id=302, label=propertyKey, name=loc_id}
    ==>{id=558, label=propertyKey, dataType=Varchar, name=amount, type=Regular}
    ==>{id=48, label=propertyKey, dataType=Varchar, name=name, type=Regular}
    ==>{id=305, label=propertyKey, name=geo_point}
    ==>{id=563, label=incident}
    ==>{id=308, label=propertyKey, name=loc_details}
    ==>{id=53, label=vertexLabel, name=store}
    ==>{id=55, label=propertyKey, dataType=Int, name=store_id, type=PartitionKey}
    ==>{id=567, label=edgeLabel, name=includes}
    ==>{id=312, label=vertexLabel, name=recipe}
    ==>{id=569, label=propertyKey, dataType=Varchar, name=amount, type=Regular}
    ==>{id=314, label=propertyKey, dataType=Int, name=recipe_id, type=PartitionKey}
    ==>{id=60, label=propertyKey, dataType=Varchar, name=name, type=Regular}
    ==>{id=574, label=incident}
    ==>{id=319, label=propertyKey, dataType=Varchar, name=instructions, type=Regular}

Using elementMap() or valueMap() without specifying properties can result in slow queries if a large number of property keys exist for the queried vertex or edge. To limit the result, specify the properties of interest, such as elementMap('name') or valueMap('name').
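
As a minimal sketch of that narrowing, assuming the person vertex and the name index from the earlier step, the following traversals request only the name property rather than every property key:

    g.V().hasLabel('person').has('name', 'Julia CHILD').elementMap('name')
    g.V().hasLabel('person').has('name', 'Julia CHILD').valueMap('name')

In Apache TinkerPop, the elementMap('name') form also returns the id and label of the vertex, while valueMap('name') returns only the named property.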

  • Running schema.traversal() reports the number of vertices and edges that represent schema elements, as well as the TraversalSource type:

    schema.traversal()
    ==>graphtraversalsource[tinkergraph[vertices:140 edges:140], standard]
  • Get the names of the current graphs:

    system.graphs()
    ==>food_qs
    ==>food_cql
    ==>food
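
The vertex and edge counts at the start of this procedure can also be broken down by label, which helps confirm that each vertex label and edge label received data. The following is a sketch using the standard TinkerPop groupCount() step. Like the count() checks, it scans the whole graph, so it assumes that such scans are permitted in the environment, and it is best suited to small development graphs:

    g.V().groupCount().by(label)
    g.E().groupCount().by(label)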
