DSE Analytics and Search integration

DSE SearchAnalytics clusters can use DSE Search queries within DSE Analytics jobs.

An integrated DSE SearchAnalytics cluster lets analytics jobs run against the results of search queries. This integration gives finer-grained control over the types of queries used in analytics workloads, and improves performance because the search query reduces the amount of data that the analytics job processes.

Nodes in SearchAnalytics mode allow you to create analytics queries that use DSE Search indexes. These queries return RDDs that are used by Spark jobs to analyze the returned data.

The following code shows how to use a DSE Search query from the DSE Spark console.

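// Reference the "solr" table in the "music" keyspace.
val table = sc.cassandraTable("music","solr")
// Select two columns, filter rows with a DSE Search query, and collect the matches to the driver.
val result = table.select("id","artist_name").where("solr_query='artist_name:Miles*'").collect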
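
The where clause above embeds the Solr query inside a single-quoted string literal, so a query term that itself contains a single quote would break the predicate. The following is a minimal sketch of one way to build that predicate safely. The SolrPredicate object and its solrQuery method are hypothetical names introduced here for illustration; they are not part of DSE or the Spark Cassandra connector. The sketch assumes the predicate is passed through as CQL, where a single quote inside a string literal is escaped by doubling it.

```scala
// Hypothetical helper, not a DSE or connector API: wraps a Solr query
// string in the solr_query predicate form used with where(...).
object SolrPredicate {
  def solrQuery(query: String): String = {
    // Assumption: the predicate is treated as CQL, so a single quote
    // inside the string literal is escaped by doubling it.
    val escaped = query.replace("'", "''")
    "solr_query='" + escaped + "'"
  }
}
```

In the DSE Spark console, the result could then be passed to the same call shown above, for example table.select("id","artist_name").where(SolrPredicate.solrQuery("artist_name:Miles*")).collect.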

For a more complete example, see Running the Wikipedia demo with SearchAnalytics.

Planning a DSE SearchAnalytics cluster

Create a DSE SearchAnalytics cluster as a new cluster in a data center, as described in Single data center deployment per workload type. When you use the DseSimpleSnitch, the name of the data center is set to SearchAnalytics. Do not convert existing search or analytics nodes into SearchAnalytics nodes.

SearchAnalytics nodes might consume more resources than search or analytics nodes. Because the resource requirements of the nodes depend greatly on the query patterns you use, we recommend load testing to ensure that your hardware has enough CPU and memory for the additional overhead of running Spark and Solr together.

Limitations of DSE SearchAnalytics clusters

Although you can query Solr indexes from Spark in a SearchAnalytics data center, that data center gets none of the benefits of workload isolation. DataStax recommends that you do not run real-time DSE Search queries against SearchAnalytics nodes while Spark jobs are running on them.

SearchAnalytics clusters are considered experimental, and should not be run in production environments.