CQL data access
Use the CqlNativeStorage handler with the input_cql statement or use the output_query statement that was available in earlier releases.
In DataStax Enterprise 4.0.4, to access data in CQL tables, use the CqlNativeStorage handler with the new input_cql statement or use the output_query statement that was available in earlier releases.
In DataStax Enterprise 4.0-4.0.3, to access data in CQL tables, use the CqlStorage() handler. To access data in the CassandraFS, the target keyspace and table must already exist. Data in a Pig relation can be stored in a Cassandra table, but Pig will not create the table.
- DataStax Enterprise 4.0.4
<pig_relation_name> = LOAD 'cql://<keyspace>/<table>' USING CqlNativeStorage(); -- DataStax Enterprise 4.0.4
- DataStax Enterprise 4.0 - 4.0.3
<pig_relation_name> = LOAD 'cql://<keyspace>/<table>' USING CqlStorage(); -- DataStax Enterprise 4.0 - 4.0.3
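The version rule above can be captured in a small helper when Pig scripts are generated for more than one release. The following Python sketch is only an illustration: the function names `storage_handler` and `load_statement` are hypothetical, and the sketch deliberately covers only the releases named in this section.

```python
def storage_handler(version):
    """Return the Pig storage handler name for a DataStax Enterprise version.

    Only the releases named in this section are covered; anything else
    raises ValueError rather than guessing.
    """
    parts = tuple(int(p) for p in version.split("."))
    # Pad "4.0" to (4, 0, 0) so it compares correctly with (4, 0, 3).
    parts = parts + (0,) * (3 - len(parts))
    if parts == (4, 0, 4):
        return "CqlNativeStorage"
    if (4, 0, 0) <= parts <= (4, 0, 3):
        return "CqlStorage"
    raise ValueError(f"handler not documented here for version {version}")


def load_statement(relation, keyspace, table, version):
    """Build the LOAD statement for the given release."""
    handler = storage_handler(version)
    return f"{relation} = LOAD 'cql://{keyspace}/{table}' USING {handler}();"


print(load_statement("rows", "ks", "tab", "4.0.4"))
print(load_statement("rows", "ks", "tab", "4.0.2"))
```

Raising an error for unlisted versions keeps the helper from silently picking a handler that this section does not document.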
DataStax Enterprise supports these Pig data types:
- int
- long
- float
- double
- boolean
- chararray
LOAD schema
The LOAD schema is:
(colname:colvalue, colname:colvalue, … )
where each colvalue is referenced by the Cassandra column name.
Accessing data using input_cql and the CqlNativeStorage handler
An input_cql statement used with the CqlNativeStorage handler contains the following components:
- A SELECT statement that includes the partition key columns
- A WHERE clause that includes the range of the columns consistent with the order in the cluster and in the following format:
WHERE token(partitionkey) > ? and token(partitionkey) <= ?
- The value of the native_port
For example, the input_cql statement before encoding might look like this:
'SELECT * FROM ks.tab where token(key) > ? and token (key) <= ?' USING CqlNativeStorage();
After encoding, the statement is passed in the LOAD URL like this:
x = LOAD 'cql://ks/tab?input_cql=SELECT%20*%20FROM%20ks.tab%20where%20token(key)%20%3E%20%3F%20and%20token%20(key)%20%3C%3D%20%3F' USING CqlNativeStorage();
The native_port value is appended as a further parameter:
&native_port=9042
The entire migrated Pig command would look like this:
x = LOAD 'cql://ks/tab?input_cql=SELECT%20*%20FROM%20ks.tab%20where%20token(key)%20%3E%20%3F%20and%20token%20(key)%20%3C%3D%20%3F&native_port=9042' USING CqlNativeStorage();
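The encoding and concatenation steps can also be scripted. The following Python sketch is a minimal illustration, not part of DataStax Enterprise: the helper name `build_load_statement` is hypothetical, and it assumes that `*`, `(`, and `)` are left unencoded, as they are in the example above.

```python
from urllib.parse import quote


def build_load_statement(relation, keyspace, table, cql, native_port):
    """Return a Pig LOAD statement with a URL-encoded input_cql parameter.

    Hypothetical helper for illustration only.
    """
    # Leave '*', '(' and ')' unencoded to match the example in this section;
    # everything else outside letters, digits, and '_.-~' is percent-encoded.
    encoded_cql = quote(cql, safe="*()")
    url = (f"cql://{keyspace}/{table}"
           f"?input_cql={encoded_cql}&native_port={native_port}")
    return f"{relation} = LOAD '{url}' USING CqlNativeStorage();"


statement = build_load_statement(
    "x", "ks", "tab",
    "SELECT * FROM ks.tab where token(key) > ? and token (key) <= ?",
    9042,
)
print(statement)
```

With these inputs the helper should reproduce the complete command shown above.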
Optional input_cql parameters
- &native_port=<native_port>
- &core_conns=<core_conns>
- &max_conns=<max_conns>
- &min_simult_reqs=<min_simult_reqs>
- &max_simult_reqs=<max_simult_reqs>
- &native_timeout=<native_timeout>
- &native_read_timeout=<native_read_timeout>
- &rec_buff_size=<rec_buff_size>
- &send_buff_size=<send_buff_size>
- &solinger=<solinger>
- &tcp_nodelay=<tcp_nodelay>
- &reuse_address=<reuse_address>
- &keep_alive=<keep_alive>
- &auth_provider=<auth_provider>
- &trust_store_path=<trust_store_path>
- &key_store_path=<key_store_path>
- &trust_store_password=<trust_store_password>
- &key_store_password=<key_store_password>
- &cipher_suites=<cipher_suites>
- &input_cql=<input_cql>
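Because each parameter is appended to the URL as `&name=value`, the query string can be assembled mechanically. The following Python sketch is only an illustration: the helper `append_params` is hypothetical, the `core_conns` value is made up, and parameter values are assumed to need no further encoding.

```python
# Parameter names copied from the list above.
KNOWN_PARAMS = {
    "native_port", "core_conns", "max_conns", "min_simult_reqs",
    "max_simult_reqs", "native_timeout", "native_read_timeout",
    "rec_buff_size", "send_buff_size", "solinger", "tcp_nodelay",
    "reuse_address", "keep_alive", "auth_provider", "trust_store_path",
    "key_store_path", "trust_store_password", "key_store_password",
    "cipher_suites", "input_cql",
}


def append_params(base_url, params):
    """Append name=value pairs to a cql:// URL, rejecting unknown names.

    Hypothetical helper; values are assumed to be URL-safe already.
    """
    unknown = sorted(set(params) - KNOWN_PARAMS)
    if unknown:
        raise ValueError("unknown parameter(s): " + ", ".join(unknown))
    if not params:
        return base_url
    pairs = "&".join(f"{name}={value}" for name, value in params.items())
    # Start the query string with '?' unless the URL already has one.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + pairs


base = ("cql://ks/tab?input_cql=SELECT%20*%20FROM%20ks.tab%20where%20token(key)"
        "%20%3E%20%3F%20and%20token%20(key)%20%3C%3D%20%3F")
# core_conns=2 is an illustrative value, not a recommendation.
url = append_params(base, {"native_port": 9042, "core_conns": 2})
print(f"x = LOAD '{url}' USING CqlNativeStorage();")
```

Checking names against the documented list catches typos such as a misspelled parameter before the script reaches Pig.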
Handling special characters in the CQL
If the input_cql or output_query passed to a Pig function contains special characters, URL-encode the prepared statement so that Pig can read the special characters.
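To see which characters count as special in practice, an already-encoded statement can be decoded and inspected. The following Python sketch uses the standard `urllib.parse` module on the encoded input_cql example from this section; the choice of Python, and of treating `*`, `(`, and `)` as characters that stay unencoded, are assumptions based on that example rather than rules stated by Pig.

```python
from urllib.parse import quote, unquote

encoded = ("SELECT%20*%20FROM%20ks.tab%20where%20token(key)%20%3E%20%3F"
           "%20and%20token%20(key)%20%3C%3D%20%3F")

# Decode the statement to recover the readable CQL.
decoded = unquote(encoded)
print(decoded)

# Collect each character that changes when the statement is encoded.
special = sorted({ch for ch in decoded if quote(ch, safe="*()") != ch})
for ch in special:
    # Show the character next to its percent-encoded form.
    print(repr(ch), "->", quote(ch, safe=""))

# Re-encoding the decoded text should give back the original string.
assert quote(decoded, safe="*()") == encoded
```

The round-trip check at the end is a quick way to confirm that an encoded statement was not altered by hand.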