Using the Northwind demo graph with Spark OLAP jobs
Run OLAP queries against the Northwind demo graph data.
The Northwind demo included with the DSE demos has a script for creating a graph of the data for a fictional trading company.
In this task, you'll use the Gremlin console to create the Northwind graph, snapshot part of the graph, and run a count operation on the subgraph using the SparkGraphComputer.
Prerequisites
- Enable DSE Graph, DSE Search, and DSE Analytics modes in your datacenter.
- Install the DSE Graph Loader.
- Clone the graph-examples Git repository to the machine on
                    which you are running the Gremlin console:
                    git clone https://github.com/datastax/graph-examples.git
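Before starting the procedure, it can help to confirm that the command-line tools this task invokes are reachable. The following is a minimal sketch, not part of the DSE tooling: it assumes that git, dse, and graphloader are meant to be run as bare commands from your PATH (as they are written in this task), and the function name missing_tools is a hypothetical helper introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every listed command
# that cannot be found on PATH, one name per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumption: the task's commands are run as bare names from PATH.
missing=$(missing_tools git dse graphloader)

if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Not found on PATH:" $missing
else
    echo "All required commands found."
fi
```

If a tool is reported as missing, add its installation directory to PATH or call it by its full path in the steps that follow.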
Procedure
- 
                Load the Northwind graph and supplemental data using the
                    graphloader tool:
                graphloader -graph northwind -address localhost graph-examples/northwind/northwind-mapping.groovy -inputpath graph-examples/northwind/data &&
                graphloader -graph northwind -address localhost graph-examples/northwind/supplemental-data-mapping.groovy -inputpath graph-examples/northwind/data/
- 
                Start the Gremlin console using the dse gremlin-console
                    command:
                dse gremlin-console 
- 
                Alias g to the Northwind graph's default OLTP traversal
                    source:
                :remote config alias g northwind.g 
- 
                Set the schema mode to Development.
                To allow modifying the schema for the connected graph database, you must set the mode to Development each session. The default schema mode for DSE Graph is Production, which doesn't allow you to modify the graph's schema.
                schema.config().option('graph.schema_mode').set('Development')
- 
                Enable the use of scans and lambdas.
                schema.config().option('graph.allow_scan').set('true')
                graph.schema().config().option('graph.traversal_sources.g.restrict_lambda').set(false)
- 
                Look at the schema of the northwind graph:
                schema.describe()
- 
                Alias the traversal to the Northwind analytics OLAP traversal source,
                    a. Alias g to the OLAP traversal source for one-off analytic queries:
                :remote config alias g northwind.a
                ==>g=northwind.a
- 
                Count the number of vertices using the OLAP traversal source:
                g.V().count()
                ==>3294
                When you alias g to the OLAP traversal source database_name.a (here, northwind.a), DSE Analytics is the workload back-end.
- 
                Store subgraphs into snapshots using graph.snapshot().
                When you need to run multiple OLAP queries on a graph in one session, use snapshots of the graph as the traversal source.
                employees = graph.snapshot().vertices('employee').create()
                ==>graphtraversalsource[hadoopgraph[persistedinputrdd->persistedoutputrdd], sparkgraphcomputer]
                categories = graph.snapshot().vertices('category').create()
                ==>graphtraversalsource[hadoopgraph[persistedinputrdd->persistedoutputrdd], sparkgraphcomputer]
                The snapshot() method returns an OLAP traversal source using the SparkGraphComputer.
- 
                Run an operation on the snapshot graphs.
                Count the number of employee vertices in the snapshot graph:
                employees.V().count()
                ==>9
                Count the number of category vertices in the snapshot graph:
                categories.V().count()
                ==>8
