Specifying Cassandra and graph settings

How to set Cassandra and graph settings for a graph.

Some DSE Graph options are set on a per-graph basis. These settings are read and modified using either System or Schema API calls in the Gremlin console. Their values are stored in Cassandra tables, not in the dse.yaml file. See the DSE Graph reference for a complete list of available options. Other DSE Graph settings are configured in the dse.yaml file.

Procedure

Most per-graph options are set using the Schema API.

  • Check all non-default values of configuration settings.
    schema.config().describe()
    graph.tx_groups.default.write_consistency: ALL
    graph.tx_groups.default.read_consistency: QUORUM
  • Check the value of a specific setting.
    schema.config().option('graph.tx_groups.default.write_consistency').get()
    ALL
  • Set the value of a configuration setting.
    schema.config().option('graph.tx_groups.default.write_consistency').set('ALL')
    null
  • To retrieve all traversal sources that have been set, use the get() command with the traversal source type option:
    schema.config().option('graph.traversal_sources.*.type').get()
    REAL_TIME
  • Set the maximum time to wait for a traversal to evaluate:
    schema.config().option("graph.traversal_sources.g.evaluation_timeout").set('PT2H')
    PT2H
    Important: Setting a timeout value greater than 1095 days (maximum integer) can exceed the limit of a graph session. If a session hangs, starting a new session and setting the timeout to a lower value can recover access. This caution applies to all timeouts: evaluation_timeout, system_evaluation_timeout, analytic_evaluation_timeout, and realtime_evaluation_timeout.
    Note: The dse.yaml file has the settings realtime_evaluation_timeout_in_seconds and analytic_evaluation_timeout_in_minutes, which determine the timeout used for OLTP and OLAP queries, respectively. The command shown above, using evaluation_timeout, overrides any system-level setting for the specified traversal source g.
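The timeout value 'PT2H' has the shape of an ISO-8601 duration (two hours). As a minimal sketch, assuming the option accepts the same duration syntax that java.time.Duration parses (the text above does not confirm this), a value could be checked against the 1095-day caution before it is set. The class TimeoutCheck and its methods are hypothetical helpers, not part of DSE Graph.

```java
import java.time.Duration;

// Hypothetical helper (not part of DSE Graph): checks an ISO-8601 timeout
// string before it is passed to an evaluation_timeout option.
final class TimeoutCheck {
    // Upper bound taken from the caution above: 1095 days.
    static final Duration MAX_TIMEOUT = Duration.ofDays(1095);

    // Parses an ISO-8601 duration such as "PT2H".
    // Throws java.time.format.DateTimeParseException on malformed input.
    static Duration parse(String iso) {
        return Duration.parse(iso);
    }

    // True when the timeout is positive and does not exceed 1095 days.
    static boolean isSafe(String iso) {
        Duration d = parse(iso);
        return !d.isNegative() && !d.isZero() && d.compareTo(MAX_TIMEOUT) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(parse("PT2H").getSeconds()); // 7200
        System.out.println(isSafe("PT2H"));             // true
        System.out.println(isSafe("P2000D"));           // false
    }
}
```

Running the check in a separate step keeps an oversized value from ever reaching the graph session.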

Some options must be set using the System API.

  • Options can also be set when creating a new graph. For instance, graph replication inherits the Cassandra defaults, so the replication factor is 1 and the class is SimpleStrategy. As with Cassandra, set the replication factor for a graph before adding data.
    system.graph('gizmo').
      replication("{'class' : 'NetworkTopologyStrategy', 'dc1' : 3 }").
      ifNotExists().create()
  • DSE Graph also creates a keyspace for storing graph variables in Cassandra tables. Because this keyspace holds essential information, set its replication factor higher than 1 to guard against data loss.
    system.graph('gizmo').
      replication("{'class' : 'NetworkTopologyStrategy', 'dc1' : 3 }").
      systemReplication("{'class' : 'NetworkTopologyStrategy', 'dc1' : 3 }").
      ifNotExists().create()
  • Additional schema settings can be configured at graph creation.
    system.graph('food2').
      replication("{'class' : 'SimpleStrategy', 'replication_factor' : 1 }").
      systemReplication("{'class' : 'SimpleStrategy', 'replication_factor' : 1 }").
      option("graph.schema_mode").set("Development").
      option("graph.allow_scan").set("false").
      option("graph.default_property_key_cardinality").set("multiple").
      option("graph.tx_groups.*.write_consistency").set("ALL").
      create()
    See the DSE Graph reference for more information on these options.
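The replication() and systemReplication() calls above take a map literal written as a string. When several datacenters are involved, the string can be assembled programmatically. The following is a minimal sketch; the class ReplicationString, its method, and the datacenter name 'dc2' are hypothetical and only illustrate the string format used in the examples.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of DSE Graph): formats the map literal
// that the examples above pass to replication() as a string.
final class ReplicationString {
    // Builds a NetworkTopologyStrategy literal from datacenter name -> replica count.
    static String networkTopology(Map<String, Integer> replicasPerDc) {
        StringJoiner sj = new StringJoiner(", ", "{", "}");
        sj.add("'class' : 'NetworkTopologyStrategy'");
        for (Map.Entry<String, Integer> e : replicasPerDc.entrySet()) {
            sj.add("'" + e.getKey() + "' : " + e.getValue());
        }
        return sj.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the datacenters in insertion order.
        Map<String, Integer> dcs = new LinkedHashMap<>();
        dcs.put("dc1", 3);
        dcs.put("dc2", 2); // 'dc2' is an assumed second datacenter
        System.out.println(networkTopology(dcs));
    }
}
```

The resulting string would be passed as the argument to replication() or systemReplication() in the Gremlin console.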

The Graph API is used to set some transaction settings.

  • The allow_scan option can be set either for a single graph or, as shown here, for all actions within a transaction made on a single node. The transaction-level setting can be useful if a quorum cannot be reached for writing the option change to the system table.
    graph.tx().config().option("allow_scan", true).open()
    null