CREATE SEARCH INDEX 

Defines a new search index for an existing table. .

Defines a new search index for an existing table. Automatically creates the search index schema and configuration, then generates an index.
Restriction: Command available only on DSE Search nodes. Running search index management commands on a large datasets may take longer than the CQLSH default timeout of 10 minutes, increase the CQLSH client timeout as required.
Important: This command runs with a consistency level of QUORUM.

Synopsis 

CREATE SEARCH INDEX [IF NOT EXISTS] ON [keyspace_name.]table_name
(WITH COLUMNS column_list { option : value } [, ...]
    | [AND] PROFILES profile_name [, profile_name, ...]
    | [AND] CONFIG { option:value [, ...] }
    | [AND] OPTIONS { option:value [,  ...] } )

If the CREATE SEARCH INDEX statement specifies no options all columns are indexed using the default values.

Legend
Syntax conventions Description
UPPERCASE Literal keyword.
Lowercase Not literal.
Italics Variable value. Replace with a user-defined value.
[] Optional. Square brackets ( [] ) surround optional command arguments. Do not type the square brackets.
( ) Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.
| Or. A vertical bar ( | ) separates alternative elements. Type any one of the elements. Do not type the vertical bar.
... Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required.
'Literal string' Single quotation ( ' ) marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case.
{ key : value } Map collection. Braces ( { } ) enclose map collections or key value pairs. A colon separates the key and the value.
<datatype1,datatype2> Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma.
cql_statement; End CQL statement. A semicolon ( ; ) terminates all CQL statements.
[--] Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.
' <schema> ... </schema> ' Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration.
@xml_entity='xml_entity_type' Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrConfig files.

EBNF 

EBNF syntax:
createSearchIndex ::= 'CREATE' 'SEARCH' 'INDEX' ('IF' 'NOT' 'EXISTS')?
                      'ON' tableName
                      ('WITH' indexOptions)?
tableName         ::= (keyspace '.')? table
indexOptions      ::= indexOption ('AND' indexOption)*
indexOption       ::= 'COLUMNS' columnList
                      | 'PROFILES' profileName (',' profileName)*
                      | 'CONFIG' optionMap
                      | 'OPTIONS' optionMap
columnList        ::= column (',' column)*
column            ::= (columnName | '*')('{' optionMap '}')?
optionMap         ::= '{' (optionName ':' optionValue (',' optionName ':'
                      optionValue)*)? '}'

Railroad diagram:

createSearchIndex
Image shows a railroad diagram for the CREATE SEARCH INDEX search CQL command

tableName
Image shows a railroad diagram of tableName

indexOptions
Image shows a railroad diagram of indexOptions

indexOption
Image shows a railroad diagram of indexOption

columnList
Image shows a railroad diagram of columnList

column
Image shows a railroad diagram of column

optionMap
Image shows a railroad diagram of optionMap

COLUMNS 

Defines which fields to include in index, sets index type, and creates non-tokenized fields for faceted search.
COLUMNS column_list { copyField : true | false }, 
column_list { docValues : true | false }, 
column_list { excluded : true | false }, 
Note: When this option is used, any column no listed is excluded from the index by default except PRIMARY KEY columns which must be indexed.
column_list
Specify a comma separated list of column names or an asterisk (*) to select all columns.
copyField: (true | false)
Set to true to create a new field copied from the specified columns with type StrField. Duplicates the data from the original field into the new field. Use for columns that require both search and faceting.

Default value is false.

docValues: ( true | false)
Creates a forward index on each specified column. Setting is only valid on Solr types that extend TrieField, UUIDField, and StrField. Use on columns that are sorted or grouped (faceted). Default is true for TrieField and UUIDField types and false StrField types.

Due to SOLR-7264, setting docValues to true on a boolean field in the Solr schema does not work. A workaround for boolean docValues is to use 0 and 1 with a TrieIntField.

Note: Using spaceSavings profiles disables auto generation of DocValues.
excluded: (true | false)
True excludes the listed columns from the index and false includes the columns in the index.

When using the COLUMNS option, set excluded to false for the columns you want to include in the index.

PROFILES 

Apply space saving options to minimize index size on initial creation. Specify spaceSavingAll or a comma separated list of profiles to apply.
PROFILES profile_name [, profile_name, ...]
Note: Profiles only apply to the initial index generation, and do not apply to the ALTER SEARCH INDEX SCHEMA command.
spaceSavingAll
Applies all profiles.
spaceSavingNoJoin
Excludes the hidden primary key required for joins across tables on search queries from the index. When used table joins on search index queries are not allowed.
spaceSavingNoTextfield
Indexes text fields as untokenized string fields.
spaceSavingSlowTriePrecision
Sets trie fields precisionStep to '0', allowing for greater space saving but slower querying.

CONFIG 

Configuration options override values in the config file. The CONFIG option map can pass options with this syntax:
CONFIG { shortcut_name:value [, shortcut_name:value, ...] }
shortcuts
Shortcuts to configuration element values using SET:
  • autoCommitTime Default value is 10000
  • defaultQueryField Name of the field. Default not set.
    Note: Use SET to add. Use DROP to remove.
  • directoryFactory Specify exclusive of directoryFactoryClass. The options are:
    • StandardDirectoryFactory
    • EncryptedFSDirectoryFactory
  • filterCacheLowWaterMark Default is 128.
  • filterCacheHighWaterMark Default is 256.
  • mergeFactor Default is 10.
  • mergeMaxThreadCount Must configure with mergeMaxMergeCount. Default not set.
  • mergeMaxMergeCount Must configure with mergeMaxThreadCount. Default not set.
  • ramBufferSize Default is 512.
  • realtime Default is false

OPTIONS 

Request options configure the entire request. The OPTIONS map can pass options with this syntax:
OPTIONS { option:value [, option:value, ...] }
The request options are boolean values:
recovery
  • true - if the search core is unable to load due to corrupted index, recovers it by deleting and recreating the index. The deleteAll flag is set based on the recovery flag unless deleteAll is specifically set.
  • false - no recovery. Default.
reindex
  • true - reindexes the data. Keeps the current index (accepting reads) while the new index is building. Default.
  • false - does not reindex the data.
lenient
  • true - The SchemaGenerator silently ignores fields of type SpatialRecursivePrefixTreeFieldType (CQL PointType).
  • false - Do not ignore fields of type SpatialRecursivePrefixTreeFieldType (CQL PointType).

Examples 

The search index is created with the wiki.solr keyspace and table, and the specified options.

Create search index if it does not exist
CREATE SEARCH INDEX IF NOT EXISTS ON wiki.solr WITH COLUMNS id, body {excluded : false};
Create real time (RT) search index, but don't reindex the data
CREATE SEARCH INDEX ON wiki.solr WITH CONFIG { realtime:true } AND OPTIONS { reindex : false };
Create search index with docValues set for all columns with a supported type
CREATE SEARCH INDEX ON wiki.solr WITH COLUMNS * { docValues:true };