DSE Search versus Open Source Solr
Differences between DSE Search and Open Source Solr (OSS).
Because DSE Search is integrated into DataStax Enterprise, it differs from Open Source Solr (OSS) in several ways.
Major differences
The major differences in capabilities are:
Capability | DSE Search | OSS | Description |
---|---|---|---|
Includes a database | yes | no | A user has to create an interface to add a database to OSS. |
Indexes real-time data | yes | no | Cassandra ingests real-time data and Solr indexes the data. |
Provides an intuitive way to update data | yes | no | DataStax provides a SQL-like language and command-line shell, CQL, for loading and updating data. Data added to Cassandra shows up in Solr. |
Indexes Hadoop output without ETL | yes | no | Cassandra ingests the data, Solr indexes the data, and you run MapReduce against that data in one cluster. |
Supports data distribution | yes | yes [1] | DataStax Enterprise distributes Cassandra real-time, Hadoop, and Solr data to multiple nodes in a cluster transparently. |
Balances loads on nodes/shards | yes | no | Unlike in Solr and SolrCloud, loads can be rebalanced efficiently. |
Spans indexes over multiple data centers | yes | no | A cluster can have more than one data center for different types of nodes. |
Automatically re-indexes Solr data | yes | no | The only way to re-index data in Solr is to have the client re-ingest everything. |
Stores data added through Solr in Cassandra | yes | no | Data updated using the Solr API shows up in Cassandra. |
Makes durable updates to data | yes | no | Updates are durable and written to the Cassandra commit log regardless of how the update is made. |
Upgrades of Lucene preserve data | yes | no | DataStax integrates Lucene upgrades periodically, and when you upgrade DSE, data is preserved. Solr users must re-ingest all their data after upgrading Lucene. |
Security | yes | no | DataStax has extended SolrJ to protect internal communication and HTTP access. Solr data can be encrypted and audited. For example, use Kerberos or SSL security for a DSE instance and then run secure queries of that DSE instance by using CQL or HTTP. |
[1] Requires using ZooKeeper.
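The table says that data written through CQL shows up in Solr and can be queried over HTTP. The following Python sketch illustrates that flow without contacting a cluster: it holds a CQL insert as a string and builds the matching Solr select URL. It assumes the search core is addressed as `keyspace.table` on port 8983; the host, the `demo.users` table, its columns, and the `solr_select_url` helper are hypothetical names chosen for illustration, so check the port and core name against your own installation.

```python
from urllib.parse import urlencode


def solr_select_url(host, keyspace, table, query, port=8983, rows=10):
    """Build an HTTP select URL for a search core named keyspace.table.

    Assumption: the core is exposed under /solr/<keyspace>.<table>/select.
    """
    params = urlencode({"q": query, "wt": "json", "rows": rows})
    return f"http://{host}:{port}/solr/{keyspace}.{table}/select?{params}"


# Hypothetical CQL statement; run it in cqlsh or through a driver.
INSERT_CQL = "INSERT INTO demo.users (id, name) VALUES (1, 'alice');"

# Once the row is written and indexed, the same data could be searched over HTTP.
url = solr_select_url("localhost", "demo", "users", "name:alice")
print(url)
```

The URL is only constructed here, not requested, so the sketch runs without a cluster. On a secured instance the scheme and authentication would differ.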