Indexing map columns
DataStax Enterprise (DSE) Search indexes a CQL map column using a Solr dynamic field. Dynamic fields apply the field definition using a wildcard match on the name. In the search index schema, DSE sets the dynamic field name to the CQL column name with an asterisk appended. DSE parses the data from a map using the key name and Solr will index only the keys that have the column name as the prefix. Keys that do not have the column name as a prefix are ignored.
For example, a search index with default settings on a map column named blist_
has the following definition:
<dynamicField indexed="true" multiValued="false" name="blist_*" type="StrField"/>
When DSE builds the index from the CQL rows, the key name is used, not the column name.
Therefore, all keys that have the blist_
as the prefix in the example are indexed and the rest are ignored.
Only blist_age
and blist_nation
are indexed when the following data is inserted:
INSERT INTO cycling.birthday_list (
cyclist_name, blist
) VALUES (
'Allan DAVIS', { 'age':'35', 'bday':'27/07/1980', 'nation':'AUSTRALIA' }
);
All key-value pairs in CQL maps have the same data type.
The map in the example above sets all values to text (blist map<text,text>
).
Because DSE Search loads the data by mapping the key name to the Solr dynamic field name, you can customize field type for each key.
Create a Search index on a map columns
This example creates a customized Search index for data that has the same three map keys in every record, blist_age
, bday
, and blist_nation
.
Only blist_age
and blist_nation
are indexed.
-
Create a keyspace named
cycling
:CREATE KEYSPACE IF NOT EXISTS cycling WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
-
Create a table called
birthday
in thecycling
keyspace:CREATE TABLE IF NOT EXISTS cycling.birthday_list ( cyclist_name text PRIMARY KEY, blist map<text, text> );
-
Create an index that excludes the
blist_
map column:CREATE SEARCH INDEX IF NOT EXISTS ON cycling.birthday_list WITH COLUMNS blist { excluded:true };
-
View the active schema:
DESC ACTIVE SEARCH INDEX SCHEMA ON cycling.birthday_list;
DSE sets CQL
text
to SolrStrField
type.<?xml version="1.0" encoding="UTF-8" standalone="no"?> <schema name="autoSolrSchema" version="1.5"> <types> <fieldType class="org.apache.solr.schema.StrField" name="StrField"/> </types> <fields> <field indexed="true" multiValued="false" name="cyclist_name" type="StrField"/> </fields> <uniqueKey>cyclist_name</uniqueKey> </schema>
-
To set
blist_age
to an integer, define theblist_age
type and configure a field definition:ALTER SEARCH INDEX SCHEMA ON cycling.birthday_list ADD fields.field[ @indexed='true', @multiValued='false', @name='blist_age', @type='TrieIntField' ];
-
Define the
blist_nation
field as a string type, which has a corresponding type definition of@type='StrField'
:ALTER SEARCH INDEX SCHEMA ON cycling.birthday_list ADD fields.field[ @name='blist_nation', @indexed='true', @multiValued='false', @type='StrField' ];
-
View the pending changes to the schema to ensure that the syntax is correct:
DESC PENDING SEARCH INDEX SCHEMA ON cycling.birthday_list;
<?xml version="1.0" encoding="UTF-8" standalone="no"?> <schema name="autoSolrSchema" version="1.5"> <types> <fieldType class="org.apache.solr.schema.StrField" name="StrField"/> <fieldType class="org.apache.solr.schema.TrieIntField" name="TrieIntField"/> </types> <fields> <field indexed="true" multiValued="false" name="cyclist_name" type="StrField"/> <field indexed="true" multiValued="false" name="blist_age" type="TrieIntField"/> <field indexed="true" multiValued="false" name="blist_nation" type="StrField"/> </fields> <uniqueKey>cyclist_name</uniqueKey> </schema>
-
Reload the index configuration and schema to push the changes live:
RELOAD SEARCH INDEX ON cycling.birthday_list;
-
Rebuild the index whenever fields are added:
REBUILD SEARCH INDEX ON cycling.birthday_list;
-
Use the map fields to filter queries:
-
Limit by age 23:
SELECT * FROM cycling.birthday_list WHERE solr_query = 'blist_age:23';
cyclist_name | blist ------------------+---------------------------------------------------------- Claudio HEINEN | {'age': '23', 'bday': '27/07/1992', 'nation': 'GERMANY'} Laurence BOURQUE | {'age': '23', 'bday': '27/07/1992', 'nation': 'CANADA'} (2 rows)
-
Limit by nation GERMANY, which is case-sensitive because the type is a string:
SELECT * FROM cycling.birthday_list WHERE solr_query = 'blist_nation:GERMANY';
-