Configuring the Gremlin Server in the dse.yaml file

How to configure the Gremlin Server in the dse.yaml file.

The Gremlin Server is configured using Apache TinkerPop specifications in the dse.yaml file. For the full list of configuration options, see the TinkerPop documentation.

Of particular note, the Graph sandbox is configured in the Gremlin Server options of the dse.yaml file. This feature is enabled by default and provides protection from malicious attacks within the JVM.

For DSE Graph, the top-level configurations in Gremlin Server are embedded in a key called gremlin_server. Although most configuration options for Gremlin Server come from these settings, the following options are ignored and overridden:

  • authentication: This setting is ignored as Gremlin Server delegates all security to DSE.
  • channelizer: The Channelizer implementation for Gremlin Server is always set to DseWebSocketChannelizer, which includes hooks into DSE audit and authentication features.
  • graphs: The list of graphs is completely ignored. Graph instances are created when they are requested.
  • host: The host is ignored; it is always equal to rpc_address from cassandra.yaml.
  • processors: The StandardOpProcessor and SessionOpProcessor that are shipped with Gremlin Server internally are overwritten. DataStax recommends not attempting to include additional OpProcessor implementations.
  • scriptEvaluationTimeout: This setting is ignored. Timeouts are instead configured globally in the dse.yaml file and individually on graph traversal source instances.
  • ssl: Gremlin Server uses the same SSL context as DataStax Enterprise.
  • strictTransactionManagement: All requests to DataStax Enterprise have transaction management enabled.
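
As a minimal sketch of how the list above could be used, the following Python snippet reports which keys in a gremlin_server section have no effect. It is not part of DSE; the function name find_ignored_options and the example settings are hypothetical, and the section is assumed to have already been parsed into a dictionary.

```python
# Keys under gremlin_server that DSE Graph ignores or overrides,
# copied from the list above.
IGNORED_KEYS = {
    "authentication",
    "channelizer",
    "graphs",
    "host",
    "processors",
    "scriptEvaluationTimeout",
    "ssl",
    "strictTransactionManagement",
}


def find_ignored_options(gremlin_server_config):
    """Return the sorted names of configured options that DSE ignores."""
    return sorted(key for key in gremlin_server_config if key in IGNORED_KEYS)


# Hypothetical settings, as they might look after parsing the
# gremlin_server section of dse.yaml into a dictionary.
example = {"port": 8182, "host": "localhost", "gremlinPool": 8, "ssl": {"enabled": True}}
print(find_ignored_options(example))  # prints ['host', 'ssl']
```

A check like this only flags settings that will not take effect; it does not validate the remaining options.
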
This example shows the basic features that are configured:
  # This is the default configuration for Gremlin Server. For more information about these
  # configuration options, see the TinkerPop documentation:
  #
  # http://tinkerpop.apache.org/docs/3.2.0-incubating/reference/#_configuring_2
  #
  # Note that for DSE Graph, the top-level configurations in Gremlin Server are embedded in a key called
  # gremlin_server. It is further important to understand that while most configuration options for
  # Gremlin Server come from these settings, there are some that are ignored and overridden:
  #
  # - authentication: This setting is ignored as Gremlin Server delegates all security to DSE.
  # - channelizer: The Channelizer implementation for Gremlin Server is always set to
  #   DseWebSocketChannelizer which includes hooks into DSE audit and authentication features.
  # - graphs: The list of graphs are completely ignored. Graph instances are created when they are
  #   requested.
  # - host: The host is ignored, it is always equal to rpc_address from cassandra.yaml
  gremlin_server:
    port: 8182  
    threadPoolWorker: 2   
    gremlinPool: 8
    maxContentLength: 65536000
    maxChunkSize: 4096000
    maxInitialLineLength: 4096
    maxHeaderSize: 8192
    maxAccumulationBufferComponents: 1024
    resultIterationBatchSize: 64
    useEpollEventLoop: false
    serializers:
      - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistry], classResolverSupplier: com.datastax.bdp.graph.impl.tinkerpop.io.DseClassResolverProvider }}
      - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
      - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistry] }}
      - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistry] }}
    scriptEngines:
      gremlin-groovy:
        config:
          compilerCustomizerProviders:
            "org.apache.tinkerpop.gremlin.groovy.jsr223.customizer.ThreadInterruptCustomizerProvider": []
            "org.apache.tinkerpop.gremlin.groovy.jsr223.customizer.InterpreterModeCustomizerProvider": []
gremlin_server
The top-level configurations in Gremlin Server.
port 
The port value identifies the available communications port for Gremlin Server. Default: 8182
threadPoolWorker 
The number of worker threads that handle requests and responses on the Gremlin Server channel, including routing requests to the right server operations, handling scheduled jobs on the server, and writing serialized responses back to the client. Default: 2
gremlinPool 
The number of Gremlin threads available to execute actual scripts in a ScriptEngine. This pool represents the workers available to handle blocking operations in Gremlin Server. Default: 8
maxContentLength 
The maximum length of a block of content passed between the Gremlin console and the Gremlin Server. If the content length exceeds this value, the transfer encoding of the decoded request is converted to chunked and the content is split into multiple HttpContent objects. Default: 65536000
maxChunkSize 
The maximum chunk size passed between the Gremlin console and the Gremlin Server. If the transfer encoding of the HTTP request is already chunked and the length of the chunk exceeds this value, each chunk is split into smaller chunks. Default: 4096000
maxInitialLineLength 
The maximum length of the initial line that is processed in a request, for example GET / HTTP/1.0. This value controls the maximum length of the submitted URI. Default: 4096
maxHeaderSize 
The maximum length of all headers. Default: 8192
maxAccumulationBufferComponents 
The maximum number of request components that can be aggregated for a message. Default: 1024
resultIterationBatchSize 
The number of result items sent back to the client in each batch. For example, for a value of 1, a result with 10 items is sent as 10 individual items. For a value of 2, a result with 10 items is sent in 5 batches of 2 each. This value can be overridden per request. Default: 64
useEpollEventLoop 
Linux only: Try to use epoll event loops instead of Netty NIO. Default: false
serializers 
The class and configuration for the serializers that allow the DataStax Enterprise driver and the Gremlin driver to communicate with the Gremlin Server.
metrics 
Uncomment and adjust settings to turn on and configure metrics gathering functionality and reporting options:
# metrics: {
#  consoleReporter: {enabled: true, interval: 180000},
#  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
#  jmxReporter: {enabled: true},
#  slf4jReporter: {enabled: true, interval: 180000},
#  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
#  graphiteReporter: {enabled: false, interval: 180000}}
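
The batching arithmetic described for resultIterationBatchSize can be sketched in Python as follows. This only illustrates how a result is divided into batches; it is not how Gremlin Server implements result streaming, and the function name batch_results is hypothetical.

```python
def batch_results(items, batch_size=64):
    """Split a list of result items into consecutive batches.

    batch_size plays the role of resultIterationBatchSize (default 64).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


result = list(range(10))  # a result with 10 items

print(len(batch_results(result, 1)))  # prints 10: each item is sent individually
print(len(batch_results(result, 2)))  # prints 5: five batches of 2 items each
print(len(batch_results(result)))     # prints 1: all 10 items fit in one batch of 64
```

Smaller batches mean more response messages for the same result; larger batches mean fewer messages that each carry more items.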

The Gremlin Server can communicate using different language variants of Gremlin. In the dse.yaml file, gremlin-groovy is configured by default.

The location of the dse.yaml file depends on the type of installation:
  • Installer-Services: /etc/dse/dse.yaml
  • Package installations: /etc/dse/dse.yaml
  • Installer-No Services: install_location/resources/dse/conf/dse.yaml
  • Tarball installations: install_location/resources/dse/conf/dse.yaml
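
For scripts that need to locate the file, the mapping above could be expressed as a small lookup. This is a sketch only: the function name dse_yaml_path and the example directory /opt/dse are hypothetical, and install_location stands for whatever directory DSE was installed in.

```python
# Location of dse.yaml for each installation type, as listed above.
DSE_YAML_PATHS = {
    "Installer-Services": "/etc/dse/dse.yaml",
    "Package installations": "/etc/dse/dse.yaml",
    "Installer-No Services": "{install_location}/resources/dse/conf/dse.yaml",
    "Tarball installations": "{install_location}/resources/dse/conf/dse.yaml",
}


def dse_yaml_path(installation_type, install_location=None):
    """Return the dse.yaml path for the given installation type."""
    template = DSE_YAML_PATHS[installation_type]
    if "{install_location}" in template:
        if install_location is None:
            raise ValueError("install_location is required for this installation type")
        return template.format(install_location=install_location.rstrip("/"))
    return template


print(dse_yaml_path("Package installations"))
# /opt/dse is a hypothetical installation directory.
print(dse_yaml_path("Tarball installations", "/opt/dse"))
```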