CQL pushdown filter
Optimize the processing of the data by moving filtering expressions in Pig as close to the data source as possible.
DataStax Enterprise includes a CqlStorage URL option, use_secondary. Setting the option to
true optimizes the processing of the data by moving filtering expressions in Pig as close to
the data source as possible. To use this capability:
- Create an index for the Cassandra table.
  For Pig pushdown filtering, the secondary index must have the same name as the column being indexed.
- Include the use_secondary option with a value of true in the URL for the storage handler. The option name reflects the former term for a Cassandra index: secondary index. For example:
newdata = LOAD 'cql://ks/cf_300000_keys_50_cols?use_secondary=true' USING CqlNativeStorage();
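
The two steps above might fit together as in the following minimal sketch. The column name col1, its text type, and the value 'some_value' are hypothetical and chosen only for illustration. The CQL statement appears as a comment because it would be run in cqlsh, not in Pig. Whether a particular predicate is actually pushed down to Cassandra depends on the storage handler and the DataStax Enterprise version, so treat the FILTER line as an assumption to verify, not as guaranteed behavior.

```pig
-- Step 1 (run in cqlsh, not in Pig): create an index whose name matches
-- the indexed column. "col1" is a hypothetical column name.
--   CREATE INDEX col1 ON ks.cf_300000_keys_50_cols (col1);

-- Step 2: load the table with use_secondary=true in the storage handler URL.
newdata = LOAD 'cql://ks/cf_300000_keys_50_cols?use_secondary=true' USING CqlNativeStorage();

-- Filter on the indexed column. With use_secondary=true this is the kind of
-- expression intended to be evaluated close to the data source.
-- 'some_value' is a placeholder, not a value from the original example.
filtered = FILTER newdata BY col1 == 'some_value';

DUMP filtered;
```

Comparing the running time of this script with use_secondary=true and with the option omitted is one way to check whether the filter is benefiting from the index.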