About DSE Search

DSE Search is part of DataStax Enterprise (DSE). DSE Search allows you to find data and create features like product catalogs, document repositories, and ad-hoc reports. See DSE Search architecture.

Both the DSE Analytics and Search integration and DSE Analytics itself can use the indexing and query capabilities of DSE Search.

DSE Search integrates Apache Solr™ 6.0.1 to manage search indexes with a persistent store.

The benefits of running enterprise search functions through DataStax Enterprise and DSE Search include:

  • DSE Search is backed by a scalable database.

  • A persistent store for search indexes.

  • A fault-tolerant search architecture across multiple datacenters.

  • Add search capacity just like you add capacity in the DSE database.

  • Set up replication for DSE Search nodes the same way as other nodes by creating a keyspace or changing the replication factor of a keyspace to optimize performance.

  • DSE Search has two indexing modes: near-real-time (NRT) indexing and live indexing, also called real-time (RT) indexing. Configure and tune DSE Search for maximum indexing throughput.

  • Near real-time query capabilities.

  • TDE encryption of DSE Search data, including search indexes and commit logs.

  • CQL index management commands simplify search index management.

  • Optional local-node management of search indexing resources with dsetool commands.

  • Read/write to any DSE Search node and automatically index stored data.

  • Examine and aggregate real-time data using CQL.

  • Fault-tolerant queries, efficient deep paging, and advanced search node resiliency.

  • Virtual nodes (vnodes) support.

  • Set the location of the search index.

  • Using CQL, DSE Search supports partial document updates that enable you to modify existing information while maintaining a lower transaction cost.

  • Supports indexing and querying of advanced data types, including tuples and user-defined types (UDTs).

  • Supports Solr tools and APIs, with the exception of several specific unsupported features.

Solr resources

Resources for more information on using Open Source Solr (OSS):

About HTTP Basic Authentication and DSE Search clusters

Only use HTTP Basic Authentication with DSE Search clusters in your testing and development environments. Do not use internal authentication on DSE Search clusters in production.

You can use HTTP Basic Authentication with DSE Search clusters; however, it is not recommended for production.

To secure DSE Search in production, enable DataStax Enterprise Kerberos authentication, or search using CQL.

If you instead enable Cassandra internal authentication by specifying authenticator: org.apache.cassandra.auth.PasswordAuthenticator in cassandra.yaml, clients must use HTTP Basic Authentication to provide credentials to Solr services. Because HTTP Basic Authentication is stateless, the authentication process must be executed on each HTTP request, which can have a significant performance impact. For this reason, DataStax does not recommend using internal authentication on DSE Search clusters in production.
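
To illustrate why HTTP Basic Authentication is stateless, the following minimal Python sketch builds (but does not send) several requests to a Solr endpoint. The host, port, keyspace and table names, and credentials are hypothetical placeholders, and the helper names are introduced here for illustration only. The point is that every request carries the full credentials, so the server has to authenticate each one.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def build_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Construct a request object with credentials attached. Nothing is sent."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(username, password))
    return request


# Hypothetical endpoint and credentials, for a test or development cluster only.
SOLR_URL = "http://localhost:8983/solr/demo_ks.demo_table/select?q=*:*"
USERNAME = "search_user"
PASSWORD = "example_password"

# Three independent requests: each one repeats the same credentials,
# because no session state is kept between them.
requests_built = [build_request(SOLR_URL, USERNAME, PASSWORD) for _ in range(3)]
headers = [r.get_header("Authorization") for r in requests_built]
```

Since there is no session to reuse, the per-request authentication cost grows with query volume, which is the performance concern described above.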

Limitations

When issuing a filter query (fq) using the frange function, such as:

transaction_date:{!frange cost=200 l=NOW/DAY-179DAYS u=NOW/DAY+1DAY incl=true incu=false}transaction_date

the NOW placeholder is not expanded to its actual value in the context of the filter cache. As a result, queries that are based on different NOW values share the same filter cache key:

"+FunctionRangeQuery(ConstantScore(frange(date(transaction_date)):[NOW/DAY-179DAYS TO NOW/DAY+1DAY]))",

compositefilter(positiveQueries=[+ConstantScore(frange(date(value)):[NOW/DAY-179DAYS TO NOW/DAY+1DAY])],negativeQueries=[_parent_:F])

This behavior results in a collision of entries in the filter cache.
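
The collision can be pictured with a toy model. The sketch below is not how DSE Search or Solr implements its filter cache; it is a simplified Python stand-in that assumes a cache keyed by the raw, unexpanded filter text and stores the resolved date window in place of the matching documents. All function and variable names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# The filter text as written by the client; NOW is still a placeholder.
FQ = "{!frange l=NOW/DAY-179DAYS u=NOW/DAY+1DAY incl=true incu=false}transaction_date"


def day_window(now: datetime):
    """Resolve NOW/DAY-179DAYS and NOW/DAY+1DAY for a concrete 'now'."""
    day_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return day_start - timedelta(days=179), day_start + timedelta(days=1)


# Toy cache: the key is the unexpanded filter text, so NOW is not part of it.
filter_cache = {}


def cached_window(fq: str, now: datetime):
    """Return the cached window for fq, computing it only on a cache miss."""
    if fq not in filter_cache:
        filter_cache[fq] = day_window(now)
    return filter_cache[fq]


day_one = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
day_two = day_one + timedelta(days=1)

first = cached_window(FQ, day_one)
second = cached_window(FQ, day_two)  # cache hit: reuses the entry built on day one

# The second query received a window that no longer matches its own NOW.
is_stale = second != day_window(day_two)
```

In this model the second query is answered from an entry computed for an earlier NOW, which is the kind of key collision described above.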

To work around this limitation, users can:

  1. Use the [… TO …] syntax for ranges instead of frange.

  2. Use the cache=false parser parameter instead of caching queries with frange.
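
Both workarounds can be sketched as small Python helpers that build the filter query string. The helper names are hypothetical, and the first one assumes that the Solr version in use accepts mixed inclusive and exclusive range brackets, which is needed to mirror incl=true incu=false from the example above.

```python
def range_filter(field: str, lower: str, upper: str,
                 include_lower: bool = True, include_upper: bool = False) -> str:
    """Workaround 1: express the bounds with [... TO ...] range syntax."""
    open_bracket = "[" if include_lower else "{"
    close_bracket = "]" if include_upper else "}"
    return f"{field}:{open_bracket}{lower} TO {upper}{close_bracket}"


def uncached_frange_filter(field: str, lower: str, upper: str,
                           include_lower: bool = True,
                           include_upper: bool = False) -> str:
    """Workaround 2: keep frange but disable caching for this filter."""
    incl = str(include_lower).lower()
    incu = str(include_upper).lower()
    return (f"{{!frange cache=false l={lower} u={upper} "
            f"incl={incl} incu={incu}}}{field}")


LOWER = "NOW/DAY-179DAYS"
UPPER = "NOW/DAY+1DAY"

fq_range = range_filter("transaction_date", LOWER, UPPER)
fq_uncached = uncached_frange_filter("transaction_date", LOWER, UPPER)

print(fq_range)
print(fq_uncached)
```

Either string can then be passed as the fq parameter of a search request. The second form gives up filter caching for that query, so it trades repeated evaluation for correct results.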
