Tuning index size and range query speed

In DataStax Enterprise, you can trade search index size against range query speed. You make this tradeoff to suit a particular use case, on a core-by-core basis, by setting the precision step of two special field types that DataStax Enterprise uses internally.

Use extreme care when performing this tuning. This advanced feature is recommended only for rare cases; in most cases, the default values are the best choice. To perform this tuning, you change the precision step of one or both of the following DataStax Enterprise internal field types:

  • token_long

    Used for filtering over token fields during query routing.

  • ttl_long

    Used for searching for expiring documents.

To change the precision step:

  1. In the fieldType definition, set the class attribute of token_long, ttl_long, or both to solr.TrieLongField.

  2. Change the precisionStep attribute from the default of 8 to another number. Choose this number based on an understanding of its impact. Usually, a smaller precision step increases both index size and range query speed, while a larger precision step reduces index size but may reduce range query speed.

    The following schema.xml snippet shows an example configuration of both field types:

    <?xml version="1.0" encoding="UTF-8" ?>
    <schema name="test" version="1.0">
      . . .
      <fieldType name="token_long" class="solr.TrieLongField" precisionStep="16" />
      <fieldType name="ttl_long" class="solr.TrieLongField" precisionStep="16" />
      . . .
    . . .
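
To get a rough feel for the tradeoff described in step 2, the following Python sketch estimates two quantities for a 64-bit value: how many trie terms are indexed per value, and an upper bound on the terms a single range query may visit. It assumes that token_long and ttl_long behave like ordinary trie-encoded long fields and uses the worst-case estimate given in the Lucene NumericRangeQuery documentation. The function names are hypothetical, and the figures are estimates, not measurements of a DataStax Enterprise index.

```python
import math

BITS_PER_LONG = 64  # assumption: both special field types hold 64-bit values


def terms_per_value(precision_step):
    """Estimated number of trie terms indexed for each long value."""
    return math.ceil(BITS_PER_LONG / precision_step)


def max_range_terms(precision_step):
    """Estimated upper bound on terms visited by one range query.

    Follows the worst-case estimate from the Lucene NumericRangeQuery
    documentation; real queries usually touch far fewer terms.
    """
    levels = terms_per_value(precision_step)
    per_level = 2 ** precision_step - 1
    return (levels - 1) * per_level * 2 + per_level


# Compare a smaller step, the default, and the step used in the snippet above.
for step in (4, 8, 16):
    print(step, terms_per_value(step), max_range_terms(step))
```

Under these assumptions, the default of 8 indexes 8 terms per value, a step of 16 indexes 4, and a step of 4 indexes 16. Fewer terms per value means a smaller index, at the cost of a much larger worst-case number of terms per range query.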

DataStax Enterprise ignores one or both of these field type definitions and uses the default precision step if you make any of these mistakes:

  • The field type is defined using a name other than token_long or ttl_long.

  • The class is something other than solr.TrieLongField.

  • The precision step value is not a number. DataStax Enterprise logs a warning.
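
The three mistakes above can be caught before deploying a schema. The following Python sketch is a hypothetical pre-flight check, not part of DataStax Enterprise and not a description of its internal validation. The function name, the messages, and the `<types>` wrapper in the sample schema are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

SPECIAL_TYPES = ("token_long", "ttl_long")
REQUIRED_CLASS = "solr.TrieLongField"


def check_special_field_types(schema_xml):
    """Map each special type name to a list of problems (empty list = looks fine)."""
    root = ET.fromstring(schema_xml)
    # Index every fieldType element in the schema by its name attribute.
    defined = {ft.get("name"): ft for ft in root.iter("fieldType")}
    report = {}
    for name in SPECIAL_TYPES:
        field_type = defined.get(name)
        if field_type is None:
            # Absent or misnamed: nothing overrides the default precision step.
            report[name] = ["no fieldType with this exact name"]
            continue
        problems = []
        if field_type.get("class") != REQUIRED_CLASS:
            problems.append("class is not " + REQUIRED_CLASS)
        step = field_type.get("precisionStep")
        if step is not None and not re.fullmatch(r"[0-9]+", step):
            problems.append("precisionStep is not a number")
        report[name] = problems
    return report


# Hypothetical schema: ttl_long has a non-numeric precision step.
SAMPLE = """<schema name="test" version="1.0">
  <types>
    <fieldType name="token_long" class="solr.TrieLongField" precisionStep="16" />
    <fieldType name="ttl_long" class="solr.TrieLongField" precisionStep="sixteen" />
  </types>
</schema>"""

for type_name, issues in check_special_field_types(SAMPLE).items():
    print(type_name, issues or "ok")
```

A type that is reported as missing is not necessarily an error: if you intend to tune only one of the two field types, the other keeps the default precision step.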

Defining the fieldType is enough to set up the special field. You do not need to declare fields of type token_long or ttl_long in the <fields> tag.
