Creating a graph

DataStax Studio behaves differently depending on the DSE Graph schema mode. In Production mode, DataStax Studio does not auto-create a graph; the graph must be created in the Gremlin console. In Development mode, DataStax Studio creates a graph and aliases a graph traversal to that graph automatically for each connection that is created.

In Development mode, DataStax Studio creates a graph automatically for each connection that is created. In the Gremlin console, a graph must always be created manually. In addition to creating the graph, a graph traversal must be aliased to the graph in order to run queries.

Procedure

Studio
  1. Start DSE Graph.
  2. Install and start Studio. Also create a Studio notebook, if needed.
  3. In DataStax Studio, create a connection. Choose a graph name; any name not already used by an existing graph will work.
  4. In DataStax Studio, create a notebook. Select the connection created in the previous step.

    A blank notebook opens with a single cell. DSE Graph runs a Gremlin Server (tinkerpop.server) on each DSE node. DataStax Studio automatically connects to the Gremlin Server and, if the graph does not already exist, creates it using the connection information. The graph is stored as one graph instance per DSE database keyspace with a replication factor of 1 and a strategy of SimpleStrategy.

    Once a graph exists, a graph traversal g is configured so that graph traversals can be executed. Graph traversals are used to query the graph data and return results. A graph traversal is bound to a specific traversal source, which is the standard OLTP traversal engine.
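
As a quick check that the traversal source g is usable, a first notebook cell might run a simple traversal such as the one below. This is a minimal sketch, assuming a graph created by Studio with default settings and with full graph scans allowed (the graph.allow_scan option); it is not a required step.

```groovy
// Count all vertices through the traversal source g that Studio configured.
// Assumption: full scans are allowed for this graph (graph.allow_scan).
g.V().count()
```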

Gremlin console
  1. Start the Gremlin console.
  2. Create a simple graph with default settings to hold the data.
    system.graph('food').create()
    ==>null
  3. Create a graph with non-default replication, system replication, and configuration settings:
    system.graph('food2').
      replication("{'class' : 'NetworkTopologyStrategy', 'dc1' : 3 }").
      systemReplication("{'class' : 'NetworkTopologyStrategy', 'dc1' : 3 }").
      option("graph.schema_mode").set("Production").
      option("graph.allow_scan").set("false").
      option("graph.default_property_key_cardinality").set("multiple").
      option("graph.tx_groups.*.write_consistency").set("QUORUM").create()
    CAUTION: For graphs created in multi-datacenter clusters, the DSE database settings must use NetworkTopologyStrategy and a replication factor greater than one. If the graph is created with SimpleStrategy and a replication factor of 1, the graph data is spread across all of the datacenters instead of being localized in the graph datacenter.
    The graph uses the DSE database replication settings: SimpleStrategy for a single datacenter or NetworkTopologyStrategy for multiple datacenters. The default replication strategy for a multi-datacenter graph is NetworkTopologyStrategy, whereas a single-datacenter graph defaults to SimpleStrategy. The number of nodes per datacenter determines the default replication factor:
    Number of nodes per datacenter | graph_name replication factor (replication) | graph_name_system replication factor (systemReplication)
    1-3                            | number of nodes per datacenter              | number of nodes per datacenter
    greater than 3                 | 3                                           | 5
    Important: The graph's schema is stored in graph_name_system, so it is extremely important to set that keyspace's replication factor consistently with the table values above. If the graph's schema is lost, the entire graph becomes inoperable. Once set, the replication factor for these keyspaces cannot be altered.
  4. On the remote Gremlin Server, alias the graph traversal g to food.g, the graph traversal of the food graph. A graph traversal must be aliased to a graph before traversals can be run.
    :remote config alias g food.g
    ==>g=food.g
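
The default replication factors listed in the table in step 3 can be expressed as a small lookup. The sketch below is plain Groovy and only mirrors that table; defaultReplicationFactors and networkTopology are hypothetical helper names introduced for illustration and are not part of DSE Graph.

```groovy
// Hypothetical helper (not a DSE Graph API): returns the default replication
// factors from the table above for a given number of nodes per datacenter.
def defaultReplicationFactors(int nodesPerDatacenter) {
    if (nodesPerDatacenter < 1) {
        throw new IllegalArgumentException('nodesPerDatacenter must be at least 1')
    }
    if (nodesPerDatacenter <= 3) {
        // 1-3 nodes: both keyspaces default to the number of nodes
        return [replication: nodesPerDatacenter, systemReplication: nodesPerDatacenter]
    }
    // More than 3 nodes: graph_name defaults to 3, graph_name_system to 5
    return [replication: 3, systemReplication: 5]
}

// Hypothetical helper: builds a replication string in the same form as the
// arguments passed to replication() and systemReplication() in step 3.
String networkTopology(String datacenter, int factor) {
    "{'class' : 'NetworkTopologyStrategy', '${datacenter}' : ${factor} }"
}

// Example: replication strings for a datacenter named dc1 with 2 nodes
def factors = defaultReplicationFactors(2)
println networkTopology('dc1', factors.replication)
println networkTopology('dc1', factors.systemReplication)
```

The resulting strings could be supplied to replication() and systemReplication() when creating a graph with explicit settings, as in step 3.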