About DSE Search
DSE Search is part of DataStax Enterprise (DSE). DSE Search allows you to find data and create features like product catalogs, document repositories, and ad-hoc reports. See DSE Search architecture.
DSE Analytics can use the indexing and query capabilities of DSE Search. See DSE Analytics and Search integration.
DSE Search integrates Apache Solr™ 6.0.1 to manage search indexes with a persistent store.
The benefits of running enterprise search functions through DataStax Enterprise and DSE Search include:
- DSE Search is backed by a scalable database.
- A persistent store for search indexes.
- A fault-tolerant search architecture across multiple datacenters.
- Add search capacity just like you add capacity in the DSE database.
- Set up replication for DSE Search nodes the same way as for other nodes: create a keyspace, or change the replication factor of an existing keyspace, to optimize performance.
- DSE Search has two indexing modes: near-real-time (NRT) indexing and live indexing, also called real-time (RT) indexing. Configure and tune DSE Search for maximum indexing throughput.
- Near-real-time query capabilities.
- Transparent data encryption (TDE) of DSE Search data, including search indexes and commit logs.
- CQL index management commands simplify search index management.
- Optional local-node management of search indexing resources with dsetool commands.
- Read from and write to any DSE Search node; stored data is indexed automatically.
- Examine and aggregate real-time data using CQL.
- Fault-tolerant queries, efficient deep paging, and advanced search node resiliency.
- Virtual nodes (vnodes) support.
- Set the location of the search index.
- Using CQL, DSE Search supports partial document updates, which let you modify existing information at a lower transaction cost.
- Supports indexing and querying of advanced data types, including tuples and user-defined types (UDTs).
- Supports all Solr tools and APIs, with several specific unsupported features.
Solr resources
Resources for more information on using Open Source Solr (OSS):
- Solr Cell project, including a tool for importing data from PDFs
About HTTP Basic Authentication and DSE Search clusters
You can use HTTP Basic Authentication with DSE Search clusters, but only in testing and development environments. Do not use internal authentication on DSE Search clusters in production.
To secure DSE Search in production, enable DataStax Enterprise Kerberos authentication, or search using CQL.
If you instead enable Cassandra internal authentication by specifying authenticator: org.apache.cassandra.auth.PasswordAuthenticator in cassandra.yaml, clients must use HTTP Basic Authentication to provide credentials to Solr services.
Due to the stateless nature of HTTP Basic Authentication, this option can have a significant performance impact because the authentication process must be executed on each HTTP request. For this reason, DataStax does not recommend using internal authentication on DSE Search clusters in production.
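The per-request cost described above follows from how HTTP Basic Authentication works: the client sends its credentials in an Authorization header on every request, and the server keeps no session. The following is a minimal sketch using only the Python standard library. The helper name basic_auth_header and the credentials are hypothetical and appear here for illustration only.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the Authorization header used by HTTP Basic Authentication."""
    # The credentials are joined with a colon and Base64-encoded (not encrypted).
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Hypothetical credentials, for illustration only.
headers = basic_auth_header("search_user", "secret")

# No session state is kept between requests, so the same header has to be
# attached to every HTTP request and verified again by the server each time.
```

Because Base64 is an encoding and not encryption, a setup like this is normally combined with TLS, which is a further reason to treat it as a development-only option.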
Limitations
When issuing a filter query (fq) using the frange function, such as:
transaction_date:{!frange cost=200 l=NOW/DAY-179DAYS u=NOW/DAY+1DAY incl=true incu=false}transaction_date
The NOW placeholder does not expand into its actual value in the context of the filter cache. As a result, the same filter cache key is used for queries that are based on different NOW values:
"+FunctionRangeQuery(ConstantScore(frange(date(transaction_date)):[NOW/DAY-179DAYS TO NOW/DAY+1DAY]))",
compositefilter(positiveQueries=[+ConstantScore(frange(date(value)):[NOW/DAY-179DAYS TO NOW/DAY+1DAY])],negativeQueries=[_parent_:F])
This behavior results in a collision of entries in the filter cache.
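The effect of such a collision can be shown with a toy model. The sketch below is not Solr's filter cache implementation; it is a plain Python dictionary keyed by the unexpanded query text, and the names filter_cache, cached_filter, and window are hypothetical. It only demonstrates that a key that still contains the literal NOW returns the first computed result on later days.

```python
from datetime import date, timedelta

# Toy stand-in for a filter cache: key is the query text, value is the result.
filter_cache = {}


def window(today: date) -> tuple:
    """The date window that NOW/DAY-179DAYS .. NOW/DAY+1DAY stands for on a given day."""
    return (today - timedelta(days=179), today + timedelta(days=1))


def cached_filter(key: str, today: date) -> tuple:
    """Return the cached window for the key, computing it only on a cache miss."""
    if key not in filter_cache:
        filter_cache[key] = window(today)
    return filter_cache[key]


# The key keeps the literal NOW instead of a concrete timestamp.
key = "frange(date(transaction_date)):[NOW/DAY-179DAYS TO NOW/DAY+1DAY]"

first = cached_filter(key, date(2024, 1, 1))
second = cached_filter(key, date(2024, 1, 2))

# The second lookup hits the entry created on the previous day,
# so it returns a window that is one day out of date.
stale = second == first and second != window(date(2024, 1, 2))
```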
To work around this limitation, users can:
- Use the [… TO …] syntax for ranges instead of frange
- Use the cache=false parser parameter instead of caching queries with frange
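As a rough illustration of the two workarounds, the following Python sketch builds the corresponding fq strings for the example above. The helper names range_fq and uncached_frange_fq are hypothetical. The mixed brackets in the first form ([ for an inclusive lower bound, } for an exclusive upper bound) are an assumption chosen to mirror incl=true incu=false from the original frange query.

```python
def range_fq(field: str, lower: str, upper: str) -> str:
    """Workaround 1: a standard range query instead of frange."""
    # "[" makes the lower bound inclusive, "}" makes the upper bound exclusive.
    return f"{field}:[{lower} TO {upper}}}"


def uncached_frange_fq(field: str, lower: str, upper: str) -> str:
    """Workaround 2: keep frange, but exclude it from the filter cache."""
    return f"{{!frange cache=false l={lower} u={upper} incl=true incu=false}}{field}"


lower, upper = "NOW/DAY-179DAYS", "NOW/DAY+1DAY"

fq_range = range_fq("transaction_date", lower, upper)
fq_uncached = uncached_frange_fq("transaction_date", lower, upper)

print(fq_range)
print(fq_uncached)
```

The first form keeps the filter cacheable, while the second trades cache reuse for correctness, so the better choice depends on how often the same range is queried.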