Architecture

A brief overview and illustration of the DSE Search architecture.

When you update a table using CQL, the Solr document is updated. Re-indexing occurs automatically after an update.



Writes are durable. A Solr API client writes data to Cassandra first, and then Cassandra updates indexes. All writes to a replica node are recorded both in memory and in a commit log before they are acknowledged as a success. If a crash or server failure occurs before the memory tables are flushed to disk, the commit log is replayed on restart to recover any lost writes.
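The durability guarantee above can be illustrated with a toy model. This is a minimal sketch, not Cassandra's actual implementation: the class ReplicaNode and its methods are hypothetical names invented for the illustration, and index updates are left out. It only shows why a write recorded in both the memtable and the commit log survives a crash that happens before a flush.

```python
class ReplicaNode:
    """Toy model of a replica's write path (hypothetical, for illustration)."""

    def __init__(self):
        self.commit_log = []  # append-only log; assumed to survive a crash
        self.memtable = {}    # in-memory table; lost on a crash
        self.on_disk = {}     # flushed data; assumed to survive a crash

    def write(self, key, value):
        # Record the write in the commit log and in memory,
        # and only then acknowledge it as a success.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        return True

    def flush(self):
        # Persist the memtable; the logged entries are no longer needed.
        self.on_disk.update(self.memtable)
        self.memtable.clear()
        self.commit_log.clear()

    def crash(self):
        # Everything held only in memory is lost.
        self.memtable = {}

    def restart(self):
        # Replay the commit log to recover writes that were never flushed.
        for key, value in self.commit_log:
            self.memtable[key] = value

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        return self.on_disk.get(key)


node = ReplicaNode()
node.write("row1", "flushed before the crash")
node.flush()
node.write("row2", "written after the flush")
node.crash()
lost = node.read("row2")       # None: the memtable is gone
node.restart()
recovered = node.read("row2")  # restored by commit log replay
```

In this model the acknowledged write to row2 is temporarily invisible after the crash and becomes readable again once the log is replayed on restart.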

The commit log replaces the Solr updateLog, which is not supported in DSE Search. Consequently, Solr features that require the updateLog are not supported.

If you still want to use the update log, configure the updateLog element in solrconfig.xml with the force="true" attribute.
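As a rough aid, the following sketch checks whether a solrconfig.xml fragment enables the update log with the force attribute. The fragment itself is an assumption: placing updateLog inside updateHandler follows the layout of a stock Solr solrconfig.xml, and only the force="true" attribute comes from the text above. The function name update_log_forced is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical solrconfig.xml fragment. Element placement follows stock Solr;
# the force="true" attribute is the one named in the text above.
SOLRCONFIG_FRAGMENT = """
<config>
  <updateHandler class="solr.DirectUpdateHandler2">
    <updateLog force="true"/>
  </updateHandler>
</config>
"""


def update_log_forced(solrconfig_xml: str) -> bool:
    """Return True if any <updateLog> element carries force="true"."""
    root = ET.fromstring(solrconfig_xml)
    return any(el.get("force") == "true" for el in root.iter("updateLog"))


forced = update_log_forced(SOLRCONFIG_FRAGMENT)
```

A configuration whose updateLog element lacks the attribute would make the function return False.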

Note: DSE Search does not support JBOD mode.

DSE Search terms

In DSE Search, there are several names for an index of documents and configuration on a single node:
  • A Solr core
  • A collection
  • One shard of a collection

Each document in a Solr core/collection is considered unique and contains a set of fields that adhere to a user-defined schema. The schema lists the field types and how they should be indexed. DSE Search maps Solr cores/collections to Cassandra tables. Each table has a separate Solr core/collection on a particular node. Solr documents are mapped to Cassandra rows, and document fields to columns. The shard is analogous to a partition of the table. The Cassandra keyspace is a prefix for the name of the Solr core/collection and has no counterpart in Solr.

This table shows the relationship between Cassandra and Solr concepts:

Cassandra      Solr single node environment
Table          Solr core or collection
Row            Document
Primary key    Unique key
Column         Field
Node           N/A
Partition      N/A
Keyspace       N/A
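The mapping can be sketched in a few lines. This is an illustration under stated assumptions, not DSE code: the helper names solr_core_name and row_to_document are hypothetical, the keyspace "demo" and table "users" are made up, and the "." separator between keyspace prefix and table name is assumed rather than stated above.

```python
def solr_core_name(keyspace: str, table: str) -> str:
    """Build a core name with the keyspace as prefix (assumed '.' separator)."""
    return f"{keyspace}.{table}"


def row_to_document(row: dict, primary_key: str) -> dict:
    """Map a row (column -> value) to a document (field -> value).

    The primary key column plays the role of the document's unique key.
    """
    if primary_key not in row:
        raise KeyError(f"row has no primary key column {primary_key!r}")
    return {"unique_key": row[primary_key], "fields": dict(row)}


core = solr_core_name("demo", "users")
doc = row_to_document({"id": "42", "name": "Ada"}, primary_key="id")
```

Here each column of the row becomes a field of the document, and the keyspace appears only in the core name, which matches its N/A entry on the Solr side.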

With Cassandra replication, a Cassandra node or Solr core contains more than one partition (shard) of a table's (collection's) data. Unless the replication factor equals the number of nodes in the cluster, each Cassandra node or Solr core contains only a portion of the table or collection data.
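A back-of-the-envelope calculation makes this concrete. The sketch below assumes a single datacenter and an even distribution of data across nodes, neither of which is stated above; the function name fraction_of_data_per_node is hypothetical.

```python
def fraction_of_data_per_node(replication_factor: int, node_count: int) -> float:
    """Approximate share of a table's data held by each node.

    Assumes one datacenter and evenly distributed data (illustrative only).
    """
    if replication_factor < 1 or node_count < 1:
        raise ValueError("replication_factor and node_count must be >= 1")
    # A node cannot hold more than one full copy of the data.
    return min(replication_factor, node_count) / node_count


partial = fraction_of_data_per_node(3, 6)  # 0.5: each node holds half the data
full = fraction_of_data_per_node(3, 3)     # 1.0: each node holds all the data
```

Only when the replication factor reaches the node count does every node, and therefore every Solr core, hold the complete table.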

Note: Do not mix Solr indexes with Cassandra secondary indexes. Attempting to use both indexes on the same table is not supported.