Configuring search index joins
Modify the search index schema to enable or disable search index joins across tables.
DataStax Enterprise supports solr_query joins on the partition key field (_partitionKey). By default, the solr_query join functionality is enabled and DSE indexes the partitioning columns in this additional field. This field, _partitionKey, increases the search index size. Disabling joins can decrease the amount of disk space that the search index uses.
Join settings in the schema
DESCRIBE ACTIVE SEARCH INDEX SCHEMA displays the schema settings of a search index. DSE hides the definition of the _partitionKey field when joins are enabled.
When the schema contains a definition for _partitionKey, support for joins is:
- Enabled: attributes docValues and indexed are set to true. For example:
  <field name="_partitionKey" docValues="true" indexed="true" stored="false" type="StrField"/>
- Disabled: attributes docValues and indexed are set to false. For example:
  <field docValues="false" indexed="false" multiValued="false" name="_partitionKey" omitNorms="true" stored="false" type="StrField"/>
If the schema does not contain a definition for _partitionKey, then joins are enabled.
Prerequisite
This section uses the Term and phrase searches using the wikipedia demo.
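The rule described under "Join settings in the schema" can be checked with a short script. The following Python sketch parses schema XML, such as the output of DESCRIBE ACTIVE SEARCH INDEX SCHEMA, and reports the join status. The function name join_status and the "unknown" result for attribute combinations that this page does not describe are assumptions made for illustration; they are not part of DSE.

```python
import xml.etree.ElementTree as ET


def join_status(schema_xml: str) -> str:
    """Classify join support from a search index schema (hypothetical helper)."""
    # Encode to bytes so that an XML declaration with an encoding is accepted.
    root = ET.fromstring(schema_xml.encode("utf-8"))
    field = root.find("./fields/field[@name='_partitionKey']")
    if field is None:
        # No explicit definition: joins are enabled by default.
        return "enabled"
    flags = {field.get("docValues"), field.get("indexed")}
    if flags == {"true"}:
        return "enabled"
    if flags == {"false"}:
        return "disabled"
    # Mixed or missing attributes are not covered by this page (assumption).
    return "unknown"


if __name__ == "__main__":
    sample = (
        '<schema name="autoSolrSchema" version="1.5"><fields>'
        '<field docValues="false" indexed="false" name="_partitionKey" type="StrField"/>'
        "</fields><uniqueKey>id</uniqueKey></schema>"
    )
    print(join_status(sample))
```

Returning a status string rather than a boolean keeps the undocumented combinations visible instead of silently mapping them to enabled or disabled.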
Disable joins
Disable joins on a search index by setting the indexed and docValues attributes of the _partitionKey field to false in the schema.
Procedure
- Verify whether the schema has the _partitionKey field and StrField fieldType definitions:
  DESCRIBE ACTIVE SEARCH INDEX SCHEMA ON wiki.solr;
  The example search index has joins enabled with no _partitionKey definition:
  <?xml version="1.0" encoding="UTF-8" standalone="no"?>
  <schema name="autoSolrSchema" version="1.5">
    <types>
      <fieldType class="org.apache.solr.schema.TextField" name="TextField">
        <analyzer>
          <tokenizer class="solr.StandardTokenizerFactory"/>
          <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
      </fieldType>
      <fieldType class="org.apache.solr.schema.TrieDateField" name="TrieDateField"/>
      <fieldType class="org.apache.solr.schema.StrField" name="StrField"/>
    </types>
    <fields>
      <field indexed="true" multiValued="false" name="body" stored="true" type="TextField"/>
      <field docValues="true" indexed="true" multiValued="false" name="real_date" stored="true" type="TrieDateField"/>
      <field indexed="true" multiValued="false" name="title" stored="true" type="TextField"/>
      <field indexed="true" multiValued="false" name="id" stored="true" type="StrField"/>
      <field indexed="true" multiValued="false" name="date" stored="true" type="TextField"/>
    </fields>
    <uniqueKey>id</uniqueKey>
  </schema>
- If required, add the string type definition:
  ALTER SEARCH INDEX SCHEMA ON wiki.solr ADD types.fieldType[@class='org.apache.solr.schema.StrField', @name='StrField'];
  The definition is added to the pending schema and is not immediately applied.
- Define the partition key field:
  - If the search index already has the partition key field, change the indexed and docValues attributes to false:
    ALTER SEARCH INDEX SCHEMA ON wiki.solr SET field[@name='_partitionKey']@docValues='false';
    ALTER SEARCH INDEX SCHEMA ON wiki.solr SET field[@name='_partitionKey']@indexed='false';
  - If the schema does not have a _partitionKey definition, add one to override the default settings:
    ALTER SEARCH INDEX SCHEMA ON wiki.solr ADD fields.field[@name='_partitionKey', @type='StrField', @docValues='false', @indexed='false'];
    Note: The type definition StrField is also required.
- Verify that the schema definition was correctly modified:
  DESCRIBE PENDING SEARCH INDEX SCHEMA ON wiki.solr;
  For example, the pending schema now includes the _partitionKey field with docValues and indexed set to false:
  <?xml version="1.0" encoding="UTF-8" standalone="no"?>
  <schema name="autoSolrSchema" version="1.5">
    <types>
      <fieldType class="org.apache.solr.schema.TextField" name="TextField">
        <analyzer>
          <tokenizer class="solr.StandardTokenizerFactory"/>
          <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
      </fieldType>
      <fieldType class="org.apache.solr.schema.TrieDateField" name="TrieDateField"/>
      <fieldType class="org.apache.solr.schema.StrField" name="StrField"/>
    </types>
    <fields>
      <field indexed="true" multiValued="false" name="body" stored="true" type="TextField"/>
      <field docValues="true" indexed="true" multiValued="false" name="real_date" stored="true" type="TrieDateField"/>
      <field indexed="true" multiValued="false" name="title" stored="true" type="TextField"/>
      <field indexed="true" multiValued="false" name="id" stored="true" type="StrField"/>
      <field indexed="true" multiValued="false" name="date" stored="true" type="TextField"/>
      <field docValues="false" indexed="false" name="_partitionKey" type="StrField"/>
    </fields>
    <uniqueKey>id</uniqueKey>
  </schema>
- Reload the schema to make it active:
  RELOAD SEARCH INDEX ON wiki.solr;
- Optionally, rebuild the search index:
  REBUILD SEARCH INDEX ON wiki.solr;
  Note: Rebuilding from CQL regenerates the index from the existing data on all search nodes, which uses significant resources and is not required when disabling joins. When no rebuild command is executed after a schema change, new data in the field is not duplicated and indexed. Use dsetool rebuild_indexes to regenerate the index on a node-by-node basis.
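The branching in the procedure above (add the StrField type only if it is missing; SET the attributes if the field exists, otherwise ADD the field) can be sketched as a small Python helper that assembles the CQL statements in order. The function name disable_join_statements and its parameters are hypothetical; the statement text is copied from the steps above, and the helper only builds strings. It does not connect to DSE, and it omits the optional rebuild.

```python
def disable_join_statements(index, has_partition_key_field, has_str_field_type=True):
    """Return the CQL statements that disable joins, in execution order.

    Hypothetical helper: `index` is the keyspace.table name, for example
    "wiki.solr". The two flags describe what DESCRIBE ACTIVE SEARCH INDEX
    SCHEMA showed for that index.
    """
    alter = f"ALTER SEARCH INDEX SCHEMA ON {index}"
    statements = []
    if not has_str_field_type:
        # The _partitionKey field needs the StrField type definition.
        statements.append(
            f"{alter} ADD types.fieldType"
            "[@class='org.apache.solr.schema.StrField', @name='StrField'];"
        )
    if has_partition_key_field:
        # Field already defined: switch both attributes to false.
        statements.append(f"{alter} SET field[@name='_partitionKey']@docValues='false';")
        statements.append(f"{alter} SET field[@name='_partitionKey']@indexed='false';")
    else:
        # Field hidden (default): add a definition that overrides the defaults.
        statements.append(
            f"{alter} ADD fields.field[@name='_partitionKey', @type='StrField', "
            "@docValues='false', @indexed='false'];"
        )
    # Pending changes take effect only after a reload.
    statements.append(f"RELOAD SEARCH INDEX ON {index};")
    return statements


if __name__ == "__main__":
    for statement in disable_join_statements("wiki.solr", has_partition_key_field=False):
        print(statement)
```

Review the output of DESCRIBE PENDING SEARCH INDEX SCHEMA before running the final RELOAD statement, as the procedure describes.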
Enable joins
Enable joins on a search index where joins were previously disabled.
Set the _partitionKey field attributes docValues and indexed to true, reload the schema, and rebuild the index.
Run the commands in cqlsh. Before launching cqlsh, you can override the timeout. See Adjusting timeout for index management.
Procedure
- Start cqlsh on a node that is running DSE Search.
- Set the docValues and indexed attributes to true:
  ALTER SEARCH INDEX SCHEMA ON wiki.solr SET field[@name='_partitionKey']@docValues='true';
  ALTER SEARCH INDEX SCHEMA ON wiki.solr SET field[@name='_partitionKey']@indexed='true';
- Verify that the schema definition was correctly modified:
  DESCRIBE PENDING SEARCH INDEX SCHEMA ON wiki.solr;
  For example, the pending schema now shows the _partitionKey field with docValues and indexed set to true:
  <?xml version="1.0" encoding="UTF-8" standalone="no"?>
  <schema name="autoSolrSchema" version="1.5">
    <types>
      <fieldType class="org.apache.solr.schema.TextField" name="TextField">
        <analyzer>
          <tokenizer class="solr.StandardTokenizerFactory"/>
          <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
      </fieldType>
      <fieldType class="org.apache.solr.schema.TrieDateField" name="TrieDateField"/>
      <fieldType class="org.apache.solr.schema.StrField" name="StrField"/>
    </types>
    <fields>
      <field indexed="true" multiValued="false" name="body" stored="true" type="TextField"/>
      <field docValues="true" indexed="true" multiValued="false" name="real_date" stored="true" type="TrieDateField"/>
      <field indexed="true" multiValued="false" name="title" stored="true" type="TextField"/>
      <field indexed="true" multiValued="false" name="id" stored="true" type="StrField"/>
      <field indexed="true" multiValued="false" name="date" stored="true" type="TextField"/>
      <field docValues="true" indexed="true" name="_partitionKey" type="StrField"/>
    </fields>
    <uniqueKey>id</uniqueKey>
  </schema>
- Reload the schema to make it active:
  RELOAD SEARCH INDEX ON wiki.solr;
- Rebuild the search index:
  REBUILD SEARCH INDEX ON wiki.solr;
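The verification step in the procedure above can be scripted before the schema is reloaded. As a minimal sketch, the following Python function takes the XML printed by DESCRIBE PENDING SEARCH INDEX SCHEMA and confirms that the _partitionKey field carries the intended attribute values. The function name pending_schema_matches and its parameters are assumptions for illustration, not a DSE API.

```python
import xml.etree.ElementTree as ET


def pending_schema_matches(schema_xml: str, joins_enabled: bool) -> bool:
    """Check that _partitionKey has docValues and indexed set as intended.

    Hypothetical helper: returns False when the field is absent, because the
    procedures on this page leave an explicit definition in the pending schema.
    """
    expected = "true" if joins_enabled else "false"
    # Encode to bytes so that an XML declaration with an encoding is accepted.
    root = ET.fromstring(schema_xml.encode("utf-8"))
    field = root.find("./fields/field[@name='_partitionKey']")
    if field is None:
        return False
    return field.get("docValues") == expected and field.get("indexed") == expected


if __name__ == "__main__":
    pending = (
        '<?xml version="1.0" encoding="UTF-8" standalone="no"?>'
        '<schema name="autoSolrSchema" version="1.5"><fields>'
        '<field docValues="true" indexed="true" name="_partitionKey" type="StrField"/>'
        "</fields><uniqueKey>id</uniqueKey></schema>"
    )
    print(pending_schema_matches(pending, joins_enabled=True))
```

Running the same check with joins_enabled=False covers the verification step of the disable procedure.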