Running the Wikipedia demo with SearchAnalytics
The following instructions describe how to run search queries in the Spark console on SearchAnalytics nodes using the Wikipedia demo.
Before you begin, create a new SearchAnalytics datacenter as described in the single datacenter deployment scenario.
-
Start the node or nodes in SearchAnalytics mode.
  - Packages/Services: See Starting DataStax Enterprise as a service.
  - Tarball/No Services: See Starting DataStax Enterprise as a stand-alone process.
-
Verify that the cluster is running correctly by running dsetool ring. The node type should be SearchAnalytics.
  - Package and Installer-Services installations:
    $ dsetool ring
  - Tarball and Installer-No Services installations:
    $ installation_location/bin/dsetool ring
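The check in this step could be scripted. The following is a minimal sketch, not part of the DataStax tooling: the helper name count_search_analytics is hypothetical, and it assumes only that each node's line in the dsetool ring output contains the literal text SearchAnalytics, as the step above implies. It does not parse any other column.

```shell
# Hypothetical helper: counts the lines on stdin that mention SearchAnalytics.
# Assumption: the node type appears as literal text on each node's line.
count_search_analytics() {
  # grep -c prints the number of matching lines; it exits non-zero when
  # the count is 0, so "|| true" keeps the helper usable under "set -e".
  grep -c 'SearchAnalytics' || true
}

# Usage (package installations):
#   dsetool ring | count_search_analytics
# Usage (tarball installations):
#   installation_location/bin/dsetool ring | count_search_analytics
```

A result of 0 would suggest that no node was started in SearchAnalytics mode; compare a non-zero result against the number of nodes you expect in the datacenter.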
-
In a terminal, go to the Wikipedia demo directory.
The default Wikipedia demo location depends on the type of installation:
  - Package installations and Installer-Services: /usr/share/dse/demos/wikipedia
  - Tarball installations and Installer-No Services: installation_location/demos/wikipedia
For example, on a package installation:
$ cd /usr/share/dse/demos/wikipedia
-
Add the schema by running the 1-add-schema.sh script.
$ ./1-add-schema.sh
-
Create the search indexes.
$ ./2-index.sh
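If you repeat the setup often, the schema and index steps can be chained so that the index script runs only when the schema script succeeds. This is a rough sketch under stated assumptions: the function name setup_wikipedia_demo is hypothetical, and the directory argument must be the demo location for your installation type.

```shell
# Hypothetical wrapper: run the schema and index scripts from the demo
# directory, stopping at the first failure.
# $1 is the Wikipedia demo directory for your installation type.
setup_wikipedia_demo() {
  demo_dir="$1"
  (
    # The subshell leaves the caller's working directory unchanged.
    cd "$demo_dir" || exit 1
    ./1-add-schema.sh && ./2-index.sh
  )
}

# Usage (package installations):
#   setup_wikipedia_demo /usr/share/dse/demos/wikipedia
```

The function returns a non-zero status if the directory does not exist or if either script fails, so it can be used in a larger script with the usual exit-status checks.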
-
Start the Spark console.
$ dse spark
-
Create an RDD based on the wiki.solr table.
scala> val table = sc.cassandraTable("wiki","solr")
table: com.datastax.spark.connector.rdd.CassandraTableScanRDD[com.datastax.spark.connector.CassandraRow] = CassandraTableScanRDD[0] at RDD at CassandraRDD.scala:15
-
Run a query using the title Solr index and collect the results.
scala> val result = table.select("id","title").where("solr_query='title:Boroph*'").collect
result: Array[com.datastax.spark.connector.CassandraRow] = Array( CassandraRow{id: 23729958, title: Borophagus parvus}, CassandraRow{id: 23730195, title: Borophagus dudleyi}, CassandraRow{id: 23730528, title: Borophagus hilli}, CassandraRow{id: 23730810, title: Borophagus diversidens}, CassandraRow{id: 23730974, title: Borophagus littoralis}, CassandraRow{id: 23731282, title: Borophagus orc}, CassandraRow{id: 23731616, title: Borophagus pugnator}, CassandraRow{id: 23732450, title: Borophagus secundus})
Equivalent JSON query (the inner double quotes must be escaped inside the Scala string):
where("solr_query='{\"q\": \"title:Boroph*\"}'")
For details on using search query syntax in CQL, see Search index syntax.