Pig demo examples set up a prepared CQL query using the output_query
statement.
Hadoop is deprecated for use with DataStax Enterprise. DSE
Hadoop and BYOH (Bring Your Own Hadoop) are deprecated. Pig is also deprecated
and will be removed when Hadoop is removed.
The Pig demo examples show the steps required for setting up a prepared CQL query
using the output_query statement:
Procedure
- Format the data
The example of saving Pig relations from/to Cassandra shows the output
schema: the name of the simple_table1 primary key, 'a', represented as a
chararray in the relation, is paired with a value from the simple_table2
table. In this case, the key of the simple_table1 table consists of only a
partition key, so only a single tuple is needed.
The Pig statement to add (moredata) fields to a tuple is:
grunt> insertformat = FOREACH morevalues GENERATE
TOTUPLE(TOTUPLE('a',x)),TOTUPLE(y);
The example of exploring library data works
with more complicated data, a partition key and clustering column:
grunt> insertformat = FOREACH moredata GENERATE
TOTUPLE(TOTUPLE('a',x),TOTUPLE('b',y),TOTUPLE('c',z)),TOTUPLE(data);
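The nesting produced by the two FOREACH statements above can be easier to see outside Pig. The following is a minimal Python sketch of the tuple layout only: an outer tuple whose first element holds the (key name, key value) pairs and whose second element holds the values to bind. The helper name to_output_tuple and the sample values are hypothetical and are not part of Pig or DataStax Enterprise.

```python
def to_output_tuple(key_columns, bound_values):
    """Model the layout ((name, value), ...), (bound_value, ...).

    key_columns: dict mapping key column names to values (hypothetical data).
    bound_values: values bound to the '?' markers of the prepared statement.
    """
    keys = tuple((name, value) for name, value in key_columns.items())
    return (keys, tuple(bound_values))


# Single partition key, as in TOTUPLE(TOTUPLE('a',x)),TOTUPLE(y)
simple_row = to_output_tuple({"a": 5}, [5])

# Three key columns, as in the library data statement (sample values are made up)
library_row = to_output_tuple({"a": 1, "b": 2, "c": 3}, ["data"])

print(simple_row)
print(library_row)
```

The first element always carries one pair per key column, so a table with clustering columns simply has more pairs in it.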
- Construct the prepared query
The output_query portion of the cql:// URL is the prepared statement. The
prepared statement must be URL-encoded so that Pig can read its special
characters.
The example of saving Pig relations from/to Cassandra shows how to construct
a prepared query:
'cql://cql3ks/simple_table1?output_query=UPDATE+cql3ks.simple_table1+set+b+%3D+%3F'
The key values of the simple_table1 table are automatically transformed into
the 'WHERE (key) =' clause to form the output_query portion of a prepared
statement.
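The URL-encoding step can be done by hand or with any standard URL-encoding tool. As one possible approach, the sketch below uses Python's standard urllib.parse.quote_plus to encode the UPDATE statement and assemble the cql:// URL. The helper name build_output_url is hypothetical; only the keyspace, table, and statement come from the example above.

```python
from urllib.parse import quote_plus


def build_output_url(keyspace, table, statement):
    """Return a cql:// URL whose output_query is the URL-encoded statement."""
    # quote_plus turns spaces into '+', '=' into %3D, and '?' into %3F.
    return f"cql://{keyspace}/{table}?output_query={quote_plus(statement)}"


url = build_output_url(
    "cql3ks", "simple_table1", "UPDATE cql3ks.simple_table1 set b = ?"
)
print(url)
```

The printed URL is the string that is passed, in single quotes, to the STORE statement.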
- Execute the query
To update the simple_table1 table using the values in the simple_table2
table (4 through 6), the prepared statement is executed with these WHERE
clauses when the MapReduce job runs:
... WHERE a = 5
... WHERE a = 4
... WHERE a = 6
The output_query in this Pig statement forms the '...' URL-encoded portion of
the prepared statement:
grunt> STORE insertformat INTO
'cql://cql3ks/simple_table1?output_query=UPDATE+cql3ks.simple_table1+set+b+%3D+%3F'
USING CqlNativeStorage;
Decoded, the UPDATE statement is:
UPDATE cql3ks.simple_table1 SET b = ?
The prepared statement represents these queries:
UPDATE cql3ks.simple_table1 SET b = 5 WHERE a = 5;
UPDATE cql3ks.simple_table1 SET b = 4 WHERE a = 4;
UPDATE cql3ks.simple_table1 SET b = 6 WHERE a = 6;
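The expansion described above can be modeled with a short Python sketch: decode the output_query, substitute the bound value, and append a WHERE clause built from the key pairs. This is only a textual illustration under stated assumptions. The real job binds values to a prepared statement rather than pasting them into text, the function name expand_statement is hypothetical, and joining several key columns with AND is an assumption.

```python
from urllib.parse import unquote_plus

ENCODED_QUERY = "UPDATE+cql3ks.simple_table1+set+b+%3D+%3F"


def expand_statement(encoded_query, key_pairs, bound_values):
    """Illustrative only: show the effective CQL for one output tuple."""
    statement = unquote_plus(encoded_query)
    # Replace each '?' marker, left to right, with the next bound value.
    for value in bound_values:
        statement = statement.replace("?", str(value), 1)
    # Assumption: key columns are combined with AND in the WHERE clause.
    where = " AND ".join(f"{name} = {value}" for name, value in key_pairs)
    return f"{statement} WHERE {where};"


# One (key pairs, bound values) tuple per row, using the values shown above.
rows = [
    ((("a", 5),), (5,)),
    ((("a", 4),), (4,)),
    ((("a", 6),), (6,)),
]
statements = [expand_statement(ENCODED_QUERY, keys, values) for keys, values in rows]
for statement in statements:
    print(statement)
```

Because the statement text is decoded directly from the URL, the keyword appears as lowercase "set" in the printed output; CQL keywords are case-insensitive.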