About DSE Search

DSE Search is part of DataStax Enterprise (DSE). DSE Search allows you to find data and create features like product catalogs, document repositories, and ad-hoc reports. See DSE Search architecture.

Both the DSE Analytics and Search integration and DSE Analytics can use the indexing and query capabilities of DSE Search. DSE Search manages search indexes with a persistent store.

The benefits of running enterprise search functions through DataStax Enterprise and DSE Search include:

  • DSE Search is backed by a scalable database.

  • A persistent store for search indexes.

  • A fault-tolerant search architecture across multiple datacenters.

  • Add search capacity just like you add capacity in the DSE database.

  • Set up replication for DSE Search nodes the same way as other nodes by creating a keyspace or changing the replication factor of a keyspace to optimize performance.

  • DSE Search has two indexing modes: Near-real-time (NRT) and live indexing, also called real-time (RT) indexing. Configure and tune DSE Search for maximum indexing throughput.

  • Near real-time query capabilities.

  • TDE encryption of DSE Search data, including search indexes and commit logs. See Encrypting Search indexes.

  • CQL index management commands simplify search index management.

  • Local node (optional) management of search indexing resources with dsetool commands.

  • Read/write to any DSE Search node and automatically index stored data.

  • Examine and aggregate real-time data using CQL.

  • Fault-tolerant queries, efficient deep paging, and advanced search node resiliency.

  • Virtual nodes (vnodes) support.

  • Set the location of the search index.

  • Native CQL queries that leverage search indexes for an array of CQL query functionality and indexing support.

  • Using CQL, DSE Search supports partial document updates that enable you to modify existing information and maintain a lower transaction cost.

  • Supports indexing and querying of advanced data types, including tuples and user-defined types (UDT). For more, see CQL data types.

  • DSE Search is built with a production-certified version of Apache Solr™. Although DSE Search uses some Solr tools and APIs, the implementation does not guarantee that Solr tools and APIs work as expected. Be sure to review the unsupported features for DSE Search.
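The replication point in the list above can be sketched as follows. This is a minimal illustration, not a DataStax tool: the helper functions, the keyspace name `shop`, and the datacenter names `dc1` and `dc2` are all hypothetical. The sketch only builds the CQL statements that create a keyspace or change its replication factor.

```python
def replication_map(dc_factors):
    """Render a NetworkTopologyStrategy replication map as a CQL map literal."""
    parts = ["'class': 'NetworkTopologyStrategy'"]
    for dc, rf in dc_factors.items():
        if rf < 1:
            raise ValueError(f"replication factor for {dc} must be at least 1")
        parts.append(f"'{dc}': {rf}")
    return "{" + ", ".join(parts) + "}"


def create_keyspace(name, dc_factors):
    """Build a CREATE KEYSPACE statement for the given datacenters."""
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} "
        f"WITH replication = {replication_map(dc_factors)};"
    )


def alter_keyspace(name, dc_factors):
    """Build an ALTER KEYSPACE statement that changes the replication factor."""
    return f"ALTER KEYSPACE {name} WITH replication = {replication_map(dc_factors)};"


# Hypothetical keyspace and datacenter names, for illustration only.
print(create_keyspace("shop", {"dc1": 3}))
print(alter_keyspace("shop", {"dc1": 3, "dc2": 2}))
```

The same statements apply whether or not the datacenter runs DSE Search, which is the point of the bullet: search nodes are replicated like any other node.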

See the DataStax blog post What’s New for Search in DSE 6. Highlights include:

  • Simplified indexing pipeline and back-pressure that reduces the frequency of dropped mutations and requires less configuration. (Soft commit is still required for update visibility.)

  • With NodeSync, search data and data repair are processed automatically by DSE.

  • Native CQL queries can use search indexes for additional CQL query functionality and index support. Search queries do not require a solr_query clause, and some queries that previously required ALLOW FILTERING no longer have that limitation because search indexes are used automatically.

  • Query LIKE operator can be used with search indexes.

  • Default search index configuration provides functionality similar to the ANSI SQL LIKE operator, and requires less processing to generate the data and less index data for the search.

  • Disabled the ability to perform writes and deletes using the Solr HTTP interface.

  • Additional logging for shard replica requests to improve troubleshooting.

  • Default index behavior from Cassandra is overridden to improve the performance of post-repair index building.
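As a rough sketch of the native CQL highlights above, the statements below contrast a query that passes a Solr expression through solr_query with a native CQL LIKE predicate that can be served by a search index. The keyspace, table, and column names are hypothetical, the two SELECT statements are not claimed to return identical results, and `session` stands for any object with an `execute(statement)` method, such as a session from a CQL driver.

```python
# Hypothetical keyspace.table and column names, for illustration only.
TABLE = "shop.products"

# CQL index management: create a search index with the default configuration.
CREATE_INDEX = f"CREATE SEARCH INDEX IF NOT EXISTS ON {TABLE};"

# Explicit search query: the Solr expression is passed in the solr_query clause.
SOLR_QUERY = f"SELECT * FROM {TABLE} WHERE solr_query = 'name:lap*';"

# Native CQL predicate: no solr_query clause; the search index is used automatically.
LIKE_QUERY = f"SELECT * FROM {TABLE} WHERE name LIKE 'lap%';"


def run_examples(session):
    """Create the index, then run both query styles and return their results.

    `session` is an assumption: anything exposing execute(statement).
    """
    session.execute(CREATE_INDEX)
    return [session.execute(query) for query in (SOLR_QUERY, LIKE_QUERY)]
```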

About HTTP Basic Authentication and DSE Search clusters

Use HTTP Basic Authentication with DSE Search clusters only in your testing and development environments. It works, but it is not recommended for production: do not use internal authentication on DSE Search clusters in production.

To secure DSE Search in production, enable DataStax Enterprise Kerberos authentication, or search using CQL.

If instead you enable Cassandra internal authentication, by specifying authenticator: org.apache.cassandra.auth.PasswordAuthenticator in cassandra.yaml, clients must use HTTP Basic Authentication to provide credentials to Solr services. Due to the stateless nature of HTTP Basic Authentication, this option can have a significant performance impact because the authentication process must be executed on each HTTP request. For this reason, DataStax does not recommend using internal authentication on DSE Search clusters in production.
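To make the per-request cost concrete, here is a minimal sketch of what a client has to attach to every HTTP request under HTTP Basic Authentication. It uses only the Python standard library; the user name, password, and request paths are hypothetical, and no request is actually sent.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header for HTTP Basic Authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def build_requests(paths, username, password):
    """Pair each request path with its own credentials header.

    HTTP Basic Authentication keeps no session state, so the header is
    repeated on every request and the server must authenticate each one.
    """
    return [(path, basic_auth_header(username, password)) for path in paths]


# Hypothetical paths and credentials, for illustration only.
for path, headers in build_requests(
    ["/solr/shop.products/select?q=*:*", "/solr/shop.products/select?q=name:lap*"],
    "search_user",
    "example-password",
):
    print(path, headers)
```

Because the credentials travel with each request, the server repeats the authentication work every time, which is the overhead described above.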

DSE Search versus Open Source Apache Solr

Differences between DSE Search and Open Source Solr (OSS).

Unsupported features for DSE Search

Unsupported Apache Cassandra, Apache Solr, and other features.

Apache Solr and Apache Lucene limitations

The Apache Solr and Apache Lucene limitations that apply to DSE Search.

Limitations

Consider a filter query (fq) that uses the frange function, such as:

transaction_date:{!frange cost=200 l=NOW/DAY-179DAYS u=NOW/DAY+1DAY incl=true incu=false}transaction_date

The NOW placeholder does not expand into its actual value in the context of the filter cache. As a result, the same filter cache key is used for queries that are based on different NOW values:

"+FunctionRangeQuery(ConstantScore(frange(date(transaction_date)):[NOW/DAY-179DAYS TO NOW/DAY+1DAY]))",

compositefilter(positiveQueries=[+ConstantScore(frange(date(value)):[NOW/DAY-179DAYS TO NOW/DAY+1DAY])],negativeQueries=[_parent_:F])

This behavior results in a collision of entries in the filter cache.
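The collision can be pictured with a toy model. The class below is not how DSE Search or Solr implements the filter cache; it is a deliberately simplified dictionary keyed by the filter text as written, with NOW left unexpanded, which is enough to show why a later query receives an entry computed for an earlier NOW.

```python
class ToyFilterCache:
    """Toy cache keyed by the filter text, with NOW left unexpanded."""

    def __init__(self):
        self._entries = {}

    def lookup(self, filter_text, compute):
        # The key never changes as time passes, because NOW is not
        # replaced by a concrete timestamp before the lookup.
        if filter_text not in self._entries:
            self._entries[filter_text] = compute()
        return self._entries[filter_text]


FILTER = "frange(date(transaction_date)):[NOW/DAY-179DAYS TO NOW/DAY+1DAY]"

cache = ToyFilterCache()
first = cache.lookup(FILTER, lambda: "documents matching Monday's window")
second = cache.lookup(FILTER, lambda: "documents matching Tuesday's window")

# The second lookup reuses the entry computed for the first one.
print(first)
print(second)
```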

To work around this limitation, users can:

  1. Use the [… TO …] syntax for ranges instead of frange.

  2. Use the cache=false parser parameter instead of caching queries with frange.
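The two workarounds can be sketched as small string builders. The helper names are hypothetical; the field and date-math bounds are taken from the example filter above. The first helper assumes that the range syntax accepts a mixed inclusive and exclusive bracket pair, which is how it reproduces incl=true and incu=false. The second mirrors the original filter and only adds cache=false.

```python
def range_filter(field, lower, upper, include_lower=True, include_upper=False):
    """Workaround 1: express the window with range syntax instead of frange."""
    open_bracket = "[" if include_lower else "{"
    close_bracket = "]" if include_upper else "}"
    return f"{field}:{open_bracket}{lower} TO {upper}{close_bracket}"


def uncached_frange_filter(field, lower, upper,
                           include_lower=True, include_upper=False, cost=200):
    """Workaround 2: keep frange but exclude the filter from the filter cache."""
    incl = "true" if include_lower else "false"
    incu = "true" if include_upper else "false"
    return (
        f"{field}:{{!frange cache=false cost={cost} "
        f"l={lower} u={upper} incl={incl} incu={incu}}}{field}"
    )


print(range_filter("transaction_date", "NOW/DAY-179DAYS", "NOW/DAY+1DAY"))
print(uncached_frange_filter("transaction_date", "NOW/DAY-179DAYS", "NOW/DAY+1DAY"))
```

Either string would be passed as the fq value in place of the original frange filter.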

