Example: copy fields and docValues

This example uses copy fields to copy various aliases, such as a Twitter name and an email alias, to a multivalue field. You can then query the multivalue field using any alias as the term to retrieve the other aliases in the same row or rows as the term.
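
The copy-field idea can be sketched outside of Solr. The following is a plain-Python simulation, not DSE or Solr code: the helper names build_all_field and find_by_alias are hypothetical, the rows mirror the sample data inserted later in the procedure, and exact-term matching is assumed because the all field in the example schema is a string field.

```python
# Illustrative simulation only: plain Python, not Solr or DSE code.
FIELDS = ("id", "name", "email", "skype", "irc", "twitter")

# Sources that the example schema copies into the "all" field (name is not copied).
ALIAS_SOURCES = ("id", "email", "skype", "irc", "twitter")

# Sample rows mirroring the tutorial's user_info.users data.
RAW_ROWS = [
    ("user1", "john smith", "jsmith@abc.com", "johnsmith", "smitty", "@johnsmith"),
    ("user2", "elizabeth doe", "lizzy@swbell.net", "roadwarriorliz", "elizdoe", "@edoe576"),
    ("user3", "dan graham", "etnaboy1@aol.com", "danielgra", "dgraham", "@dannyboy"),
    ("user4", "john smith", "jonsmit@fyc.com", "johnsmith", "jsmith345", "@johnrsmith"),
    ("user5", "john smith", "jds@adeck.net", "jdsmith", "jdansmith", "@smithjd999"),
    ("user6", "dan graham", "hacker@legalb.com", "dangrah", "dgraham", "@graham222"),
]
ROWS = [dict(zip(FIELDS, values)) for values in RAW_ROWS]


def build_all_field(row):
    """Mimic copyField: collect each source value into one multivalue list."""
    return [row[src] for src in ALIAS_SOURCES if row.get(src)]


def find_by_alias(rows, term):
    """Return the rows whose simulated 'all' field contains the exact term."""
    return [row for row in rows if term in build_all_field(row)]


if __name__ == "__main__":
    # Any alias works as the search term; the matching rows expose the others.
    for row in find_by_alias(ROWS, "smitty"):
        print(row["id"], build_all_field(row))
```

In this sketch, searching for dgraham matches both user3 and user6, because both rows carry that irc alias, while searching for a name such as john smith matches nothing, because name is not one of the copied sources.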

Step 9 covers how to see information about the per-segment field cache and filter cache. DataStax Enterprise moves the DSE per-segment filter cache off-heap by using native memory, which reduces on-heap memory consumption and garbage collection overhead. The off-heap filter cache is enabled by default. To disable it, pass the following JVM system property at startup time: -Dsolr.offheap.enable=false.

Procedure

  1. If you did not already create a directory named solr_tutorial46 that contains a schema.xml and solrconfig.xml, do so now. You can use the schema.xml and solrconfig.xml from the demos/wikipedia directory by copying these files to solr_tutorial46.
  2. Using CQL, create a keyspace and a table to store user names, email addresses, and their Skype, Twitter, and IRC names. The all field will exist in the Solr index only, so you do not need an all column in the table.
    CREATE KEYSPACE user_info
      WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
    
    CREATE TABLE user_info.users (
      id text PRIMARY KEY,
      name text,
      email text,
      skype text,
      irc text,
      twitter text
    ) ;
  3. Insert the user data. Because the schema includes a multivalue field, run a CQL BATCH command, as explained earlier.
    BEGIN BATCH
      INSERT INTO user_info.users (id, name, email, skype, irc, twitter) VALUES
      ('user1', 'john smith', 'jsmith@abc.com', 'johnsmith', 'smitty', '@johnsmith')
    
      INSERT INTO user_info.users (id, name, email, skype, irc, twitter) VALUES
      ('user2', 'elizabeth doe', 'lizzy@swbell.net', 'roadwarriorliz', 'elizdoe',  '@edoe576')
    
      INSERT INTO user_info.users (id, name, email, skype, irc, twitter) VALUES
      ('user3', 'dan graham', 'etnaboy1@aol.com', 'danielgra', 'dgraham', '@dannyboy')
    
      INSERT INTO user_info.users (id, name, email, skype, irc, twitter) VALUES
      ('user4', 'john smith', 'jonsmit@fyc.com', 'johnsmith', 'jsmith345', '@johnrsmith')
    
      INSERT INTO user_info.users (id, name, email, skype, irc, twitter) VALUES
      ('user5', 'john smith', 'jds@adeck.net', 'jdsmith', 'jdansmith',  '@smithjd999')
    
      INSERT INTO user_info.users (id, name, email, skype, irc, twitter) VALUES
      ('user6', 'dan graham', 'hacker@legalb.com', 'dangrah', 'dgraham', '@graham222')
    
    APPLY BATCH;
  4. Use a schema that contains the multivalued field named all, a copy field for each alias plus the user id, and the docValues option on the all field.
    <schema name="my_search_demo" version="1.5">
      <types>
        <fieldType name="string" class="solr.StrField"/>
        <fieldType name="text" class="solr.TextField">
          <analyzer>
            <tokenizer class="solr.StandardTokenizerFactory"/>
          </analyzer>
        </fieldType>
      </types>
      <fields>
        <field name="id"  type="string" indexed="true"  stored="true"/>
        <field name="name"  type="string" indexed="true"  stored="true"/>
        <field name="email" type="string" indexed="true" stored="true"/>
        <field name="skype" type="string" indexed="true"  stored="true"/>
        <field name="irc"  type="string" indexed="true"  stored="true"/>
        <field name="twitter" type="string" indexed="true" stored="true"/>
        <field name="all" type="string" docValues="true" indexed="true" stored="false" multiValued="true"/>
      </fields>
      <defaultSearchField>name</defaultSearchField>
      <uniqueKey>id</uniqueKey>
      <copyField source="id" dest="all"/>
      <copyField source="email" dest="all"/>
      <copyField source="skype" dest="all"/>
      <copyField source="irc" dest="all"/>
      <copyField source="twitter" dest="all"/>
    </schema>
  5. On the command line in the solr_tutorial46 directory, upload the schema.xml and solrconfig.xml to Solr. Create the Solr core for the keyspace and table, user_info.users.
    $ curl http://localhost:8983/solr/resource/user_info.users/solrconfig.xml \
      --data-binary @solrconfig.xml -H 'Content-type:text/xml; charset=utf-8'
    
    $ curl http://localhost:8983/solr/resource/user_info.users/schema.xml \
      --data-binary @schema.xml -H 'Content-type:text/xml; charset=utf-8'
    
    $ curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=user_info.users"
  6. In a browser, search Solr to find the id, name, and aliases of users who have the alias smitty.
    http://localhost:8983/solr/user_info.users/select?q=all%3Asmitty&wt=xml&indent=true
    The output is:
    <result name="response" numFound="1" start="0">
     <doc>
       <str name="id">user1</str>
       <str name="twitter">@johnsmith</str>
       <str name="email">jsmith@abc.com</str>
       <str name="irc">smitty</str>
       <str name="name">john smith</str>
       <str name="skype">johnsmith</str>
     </doc>
    </result>
  7. Run this query, which facets on the name field:
    http://localhost:8983/solr/user_info.users/select/?q=*:*&facet=true&facet.field=name&facet.mincount=1&indent=yes
    The facet results appear at the bottom of the output: three instances of john smith, two instances of dan graham, and one instance of elizabeth doe.
    . . .
    </result>
    <lst name="facet_counts">
      <lst name="facet_queries"/>
      <lst name="facet_fields">
        <lst name="name">
          <int name="john smith">3</int>
          <int name="dan graham">2</int>
          <int name="elizabeth doe">1</int>
        </lst>
      </lst>
      . . .
  8. Now you can view the status of the field cache memory to see the RAM usage of docValues per Solr field. Results look something like the example shown in Example 2.
  9. In the Solr Admin, after selecting a Solr core from the drop-down menu, click Plugins / Stats. Expand dseFieldCache and dseFilterCache to see information about the per-segment field cache and filter cache.

    Choose Watch Changes or Refresh Values to get updated information.
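
The browser queries in steps 6 and 7 can also be assembled programmatically. The following is a minimal sketch using only the Python standard library; the function names select_url, alias_query_url, and name_facet_url are hypothetical, and the host, port, and core name are taken from the tutorial and assumed to match your node. The sketch only builds the URLs and does not contact Solr.

```python
from urllib.parse import urlencode

# Host, port, and core name as used in this tutorial; adjust for your node.
SOLR_BASE = "http://localhost:8983/solr"
CORE = "user_info.users"


def select_url(params):
    """Build a select URL for the core, percent-encoding the parameters."""
    return f"{SOLR_BASE}/{CORE}/select?{urlencode(params)}"


def alias_query_url(alias):
    """Query the multivalue 'all' field for a single alias (as in step 6)."""
    return select_url({"q": f"all:{alias}", "wt": "xml", "indent": "true"})


def name_facet_url():
    """Match every document and facet on the name field (as in step 7)."""
    return select_url({
        "q": "*:*",
        "facet": "true",
        "facet.field": "name",
        "facet.mincount": "1",
        "indent": "yes",
    })


if __name__ == "__main__":
    # Paste the printed URLs into a browser or pass them to an HTTP client.
    print(alias_query_url("smitty"))
    print(name_facet_url())
```

Because urlencode percent-encodes reserved characters, the facet query is emitted as q=%2A%3A%2A rather than q=*:*; the two forms decode to the same parameter value.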