Indexing map columns
Dynamically index CQL map columns or manually set the Solr field type for key.
DataStax Enterprise (DSE) Search indexes a CQL map column using a Solr dynamic field. Dynamic fields apply the field definition using a wildcard match on the name. In the search index schema, DSE sets the dynamic field name to the CQL column name with an asterisk appended. DSE parses the data from a map using the key name and Solr will index only the keys that have the column name as the prefix. Keys that do not have the column name as a prefix are ignored.
blist_
map
column definition
is:<dynamicField indexed="true" multiValued="false" name="blist_*" type="StrField"/>
blist_
as the prefix in the example are
indexed and the rest are ignored. Only blist_age
and
blist_nation
are indexed when the following data is
inserted:INSERT INTO cycling.birthday_list (cyclist_name, blist_) VALUES ('Allan DAVIS', { 'blist_age':'35', 'bday':'27/07/1980', 'blist_nation':'AUSTRALIA'});
All key-value pairs in CQL maps have the same data type. The map in the example above sets
all values to text (blist map<text,text>
). Because DSE Search loads the
data by mapping the key name to the Solr dynamic field name, you can customize field type
for each key.
Prerequisites
This section walks you through the process of customizing the search index for data that has the same three map keys in every record, blist_age, bday (birth date), and blist_nation where only blist_age and blist_nation are indexed.
- Create the cycling keyspace
- Add the birthday_list table and data
Procedure
-
Create an index that excludes the
blist_
map column:CREATE SEARCH INDEX ON cycling.birthday_list WITH COLUMNS blist_ {excluded:true};
-
View the active schema:
DESC ACTIVE SEARCH INDEX SCHEMA ON cycling.birthday_list ;
DSE sets CQLtext
to SolrStrField
type.<?xml version="1.0" encoding="UTF-8" standalone="no"?> <schema name="autoSolrSchema" version="1.5"> <types> <fieldType class="org.apache.solr.schema.StrField" name="StrField"/> </types> <fields> <field indexed="true" multiValued="false" name="cyclist_name" type="StrField"/> </fields> <uniqueKey>cyclist_name</uniqueKey> </schema>
To set
blist_age
to an integer, the type definition is also required. -
Define the
blist_age
type and configure a field definition:ALTER SEARCH INDEX SCHEMA ON cycling.birthday_list ADD types.fieldType[@class='org.apache.solr.schema.TrieIntField', @name='TrieIntField']; ALTER SEARCH INDEX SCHEMA ON cycling.birthday_list ADD fields.field[@indexed='true', @multiValued='false', @name='blist_age', @type='TrieIntField'];
-
Define the
blist_nation
field as a string type, which has a corresponding type definition of@type='StrField'
:ALTER SEARCH INDEX SCHEMA ON cycling.birthday_list ADD fields.field[@name='blist_nation', @indexed='true', @multiValued='false', @type='StrField'];
-
View the pending changes to the schema to ensure that the syntax is correct:
DESC PENDING SEARCH INDEX SCHEMA ON cycling.birthday_list ;
<?xml version="1.0" encoding="UTF-8" standalone="no"?> <schema name="autoSolrSchema" version="1.5"> <types> <fieldType class="org.apache.solr.schema.StrField" name="StrField"/> <fieldType class="org.apache.solr.schema.TrieIntField" name="TrieIntField"/> </types> <fields> <field indexed="true" multiValued="false" name="cyclist_name" type="StrField"/> <field indexed="true" multiValued="false" name="blist_age" type="TrieIntField"/> <field indexed="true" multiValued="false" name="blist_nation" type="StrField"/> </fields> <uniqueKey>cyclist_name</uniqueKey> </schema>
-
Reload the index configuration and schema to push the changes live:
RELOAD SEARCH INDEX ON cycling.birthday_list;
-
Rebuild the index whenever fields are added:
REBUILD SEARCH INDEX ON cycling.birthday_list;
-
Use the map fields to filter queries:
- Limit by age
23:
SELECT * FROM cycling.birthday_list WHERE solr_query = 'blist_age:23';
cyclist_name | blist_ | solr_query ------------------+----------------------------------------------------------------------+------------ Claudio HEINEN | {'bday': '27/07/1992', 'blist_age': '23', 'blist_nation': 'GERMANY'} | null Laurence BOURQUE | {'bday': '27/07/1992', 'blist_age': '23', 'nation': 'CANADA'} | null (2 rows)
- Limit by nation GERMANY, which is case-sensitive because the type is
string:
SELECT * FROM cycling.birthday_list WHERE solr_query = 'blist_nation:GERMANY';
cyclist_name | blist_ | solr_query ----------------+----------------------------------------------------------------------+------------ Claudio HEINEN | {'bday': '27/07/1992', 'blist_age': '23', 'blist_nation': 'GERMANY'} | null (1 rows)
- Limit by age
23: