Customizing the search index schema

A search schema defines the relationship between data in a table and a search index. The schema identifies the columns to index and maps column names to Apache Solr™ types.

Schema defaults

DSE Search automatically maps each CQL column type to the corresponding Solr field type, defines the field type analyzer and filtering classes, and sets the docValues attribute.

If required, modify the schema using the CQL-Solr type compatibility matrix.
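As a rough illustration of what such defaults can look like, the sketch below shows a schema for a hypothetical table with an int primary key column named id and a text column named name. The schema name, the column names, and the specific type mappings (CQL int to solr.TrieIntField, CQL text to solr.StrField) are assumptions made for this example; check the compatibility matrix for the mappings that apply to your version.

```xml
<!-- Hypothetical default-style schema. Column names and type mappings are
     assumptions for illustration; confirm them in the compatibility matrix. -->
<schema name="example_defaults" version="1.5">
  <types>
    <fieldType name="TrieIntField" class="solr.TrieIntField"/>
    <fieldType name="StrField" class="solr.StrField"/>
  </types>
  <fields>
    <!-- docValues="true" is the attribute referred to as the DocValue setting -->
    <field name="id" type="TrieIntField" indexed="true" docValues="true"/>
    <field name="name" type="StrField" indexed="true" docValues="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
</schema>
```

To customize a schema like this, change the type attribute of a field to another compatible Solr type and declare that type in the types element.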

Table and schema definition

Fields with indexed="true" are indexed and stored as secondary files in Lucene so that the fields are searchable. The values of indexed fields are stored in the database, not in Lucene. Copy fields are the exception: copy field destinations exist only in the index and are not stored in the database.
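A minimal sketch of a copy field, assuming a table with id, name, and title columns. The destination field name_and_title is hypothetical and has no matching table column, which is why it is marked stored="false".

```xml
<!-- Sketch only: field names are illustrative, and name_and_title is a
     hypothetical copy field destination with no matching table column. -->
<schema name="copy_field_demo" version="1.5">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="text" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
      </analyzer>
    </fieldType>
  </types>
  <fields>
    <field name="id" type="string" indexed="true"/>
    <field name="name" type="text" indexed="true"/>
    <field name="title" type="text" indexed="true"/>
    <!-- multiValued because two source fields are copied into it -->
    <field name="name_and_title" type="text" indexed="true" stored="false" multiValued="true"/>
  </fields>
  <!-- Copy the values of name and title into the destination at index time -->
  <copyField source="name" dest="name_and_title"/>
  <copyField source="title" dest="name_and_title"/>
  <uniqueKey>id</uniqueKey>
</schema>
```

A query against name_and_title then matches text from either source column, while the database continues to hold only the name and title values.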

To set field values as lowercase and have them stored as lowercase in docValues, use the custom LowerCaseStrField type. Refer to Using LowerCaseStrField with search indexes.

Sample schema

The following example from Querying CQL collections uses a simple primary key. The schema version attribute is the Solr version number for the schema syntax and semantics. In this example, version="1.5".

<schema name="my_search_demo" version="1.5">
  <types>
    <fieldType class="solr.StrField" multiValued="true" name="StrCollectionField"/>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="text" class="solr.TextField"/>
    <fieldType class="solr.TextField" name="textcollection" multiValued="true">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
      </analyzer>
    </fieldType>
  </types>
  <fields>
    <field name="id"  type="string" indexed="true"/>
    <field name="quotes"  type="textcollection" indexed="true"/>
    <field name="name"  type="text" indexed="true"/>
    <field name="title"  type="text" indexed="true"/>
  </fields>
  <defaultSearchField>quotes</defaultSearchField>
  <uniqueKey>id</uniqueKey>
</schema>

DSE Search indexes the id, quotes, name, and title fields.

Mapping CQL primary keys and Solr unique keys

DSE Search supports CQL tables using simple or compound primary keys.

If the table uses a compound primary key or a multi-column (composite) partition key, the unique key value is enclosed in parentheses. The schema for this kind of table requires a different syntax than the schema for a simple primary key:

  • List each compound primary key column that appears in the CQL table in the schema as a field, just like any other column.

  • Declare the unique key using the key columns enclosed in parentheses.

  • Order the keys in the uniqueKey element as the keys are ordered in the CQL table.

  • When using composite partition keys, do not include the extra set of parentheses that groups the partition key columns in CQL. List all key columns in a single set of parentheses in the uniqueKey.

    Partition key: Simple CQL primary key
      CQL syntax:            CREATE TABLE ( . . . a <type> PRIMARY KEY, . . . );
                             (a is both the partition key and the primary key)
      Solr uniqueKey syntax: <uniqueKey>a</uniqueKey>
                             Parentheses are not required for a single key.

    Partition key: Compound primary key
      CQL syntax:            CREATE TABLE ( . . . PRIMARY KEY ( a, b, c ) );
                             (a is the partition key and (a, b, c) is the primary key)
      Solr uniqueKey syntax: <uniqueKey>(a, b, c)</uniqueKey>

    Partition key: Composite partition key
      CQL syntax:            CREATE TABLE ( . . . PRIMARY KEY ( ( a, b ), c ) );
                             ((a, b) is the partition key and (a, b, c) is the primary key)
      Solr uniqueKey syntax: <uniqueKey>(a, b, c)</uniqueKey>
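Putting these rules together, a schema for a table with a composite partition key might look like the sketch below. The table, its column names, and the choice of solr.StrField for every column are assumptions made for illustration only.

```xml
<!-- Hypothetical table, shown as a comment for reference:
     CREATE TABLE demo.readings (
       sensor_id text, day text, slot text, note text,
       PRIMARY KEY ( ( sensor_id, day ), slot ) );
     Table and column names are illustrative assumptions. -->
<schema name="compound_key_demo" version="1.5">
  <types>
    <fieldType name="string" class="solr.StrField"/>
  </types>
  <fields>
    <!-- Every primary key column is listed as a field, like any other column -->
    <field name="sensor_id" type="string" indexed="true"/>
    <field name="day" type="string" indexed="true"/>
    <field name="slot" type="string" indexed="true"/>
    <field name="note" type="string" indexed="true"/>
  </fields>
  <!-- Same column order as the CQL PRIMARY KEY, without the inner
       parentheses that group the partition key columns -->
  <uniqueKey>(sensor_id, day, slot)</uniqueKey>
</schema>
```

The inner grouping of (sensor_id, day) exists only in the CQL definition; the uniqueKey lists all three key columns in one set of parentheses.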


© 2024 DataStax
