Creating a URL-encoded prepared statement (deprecated)

Pig demo examples set up a prepared CQL query using the output_query statement.

Hadoop is deprecated for use with DataStax Enterprise. DSE Hadoop and BYOH (Bring Your Own Hadoop) are deprecated. Pig is also deprecated and will be removed when Hadoop is removed.

The Pig demo examples show the steps required for setting up a prepared CQL query using the output_query statement:

Procedure

  1. Format the data

    The example of saving Pig relations from/to Cassandra shows the output schema: the name of the simple_table1 primary key, 'a', represented as a chararray in the relation, is paired with a value from the simple_table2 table. In this case, the simple_table1 key consists of a partition key only, so a single key tuple is needed.

    The Pig statement to add the (morevalues) fields to a tuple is:

    grunt> insertformat = FOREACH morevalues GENERATE
             TOTUPLE(TOTUPLE('a',x)),TOTUPLE(y);

    The example of exploring library data works with more complicated data, whose key has a partition key and clustering columns, so one key tuple is needed per key column:

    grunt> insertformat = FOREACH moredata GENERATE
             TOTUPLE(TOTUPLE('a',x),TOTUPLE('b',y),TOTUPLE('c',z)),TOTUPLE(data);
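
The record shape that these FOREACH statements produce can be modeled outside Pig. The following Python sketch is only an illustration: the helper name to_insert_format and all sample values are hypothetical, and Python tuples stand in for Pig tuples.

```python
def to_insert_format(key_columns, bound_values):
    """Model one output record: a tuple of (name, value) key pairs,
    followed by a tuple of the values bound to the '?' markers."""
    key_tuple = tuple((name, value) for name, value in key_columns)
    return (key_tuple, tuple(bound_values))

# One key column, as in the simple_table1 statement (sample values are made up).
simple_record = to_insert_format([("a", 5)], [5])

# Three key columns, as in the library data statement (sample values are made up).
library_record = to_insert_format([("a", 1), ("b", 2), ("c", 3)], ["payload"])

print(simple_record)
print(library_record)
```

Each record has two fields: the first holds one (name, value) pair per key column, and the second holds the values for the prepared statement.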

  2. Construct the prepared query

    The output_query portion of the cql:// URL is the prepared statement. The prepared statement must be URL-encoded so that Pig can read its special characters.

    The example of saving Pig relations from/to Cassandra shows how to construct a prepared query:

    'cql://cql3ks/simple_table1?output_query=UPDATE+cql3ks.simple_table1+set+b+%3D+%3F'

    The key values of the simple_table1 table are automatically transformed into the 'WHERE (key) =' clause, which completes the output_query portion of the prepared statement.
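
The encoded string does not have to be written by hand. As a minimal sketch, assuming Python 3 is available on a workstation (it is not part of the Pig demo), the standard library function urllib.parse.quote_plus produces the same encoding used in the example URL:

```python
from urllib.parse import quote_plus

# The statement as it appears before encoding, without a WHERE clause.
statement = "UPDATE cql3ks.simple_table1 set b = ?"

# quote_plus turns spaces into '+' and percent-encodes '=' and '?'.
encoded = quote_plus(statement)

# Append the encoded statement as the output_query parameter of the cql:// URL.
url = "cql://cql3ks/simple_table1?output_query=" + encoded
print(url)
```

The printed URL can be pasted into the STORE statement as the quoted target.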

  3. Execute the query

    To update the simple_table1 table using the values in the simple_table2 table (4 through 6), the prepared statement is executed with these WHERE clauses when the MapReduce job runs:

    ... WHERE a = 5
    ... WHERE a = 4
    ... WHERE a = 6

    The output_query in the following Pig statement supplies the URL-encoded portion of the prepared statement that is shown as '...' in the WHERE clauses:

    grunt> STORE insertformat INTO
             'cql://cql3ks/simple_table1?output_query=UPDATE+cql3ks.simple_table1+set+b+%3D+%3F'
             USING CqlNativeStorage;

    Decoded, the UPDATE statement is:

    UPDATE cql3ks.simple_table1 SET b = ?

    The prepared statement represents these queries:

    UPDATE cql3ks.simple_table1 SET b = 5 WHERE a = 5;
    UPDATE cql3ks.simple_table1 SET b = 4 WHERE a = 4;
    UPDATE cql3ks.simple_table1 SET b = 6 WHERE a = 6;
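
To check an encoded URL, the statement can be decoded again and expanded by hand. The following Python sketch is an illustration only: the rows list is a hypothetical stand-in for the (key, value) pairs taken from simple_table2, and the string substitution merely imitates what binding the prepared statement and appending the WHERE clause produce. It is not how CqlNativeStorage executes the query.

```python
from urllib.parse import parse_qs, urlsplit

url = ("cql://cql3ks/simple_table1"
       "?output_query=UPDATE+cql3ks.simple_table1+set+b+%3D+%3F")

# parse_qs decodes '+' to a space and the percent escapes to '=' and '?'.
statement = parse_qs(urlsplit(url).query)["output_query"][0]
print(statement)

# Hypothetical (a, b) pairs matching the values used in this procedure.
rows = [(5, 5), (4, 4), (6, 6)]

# Text substitution for illustration; a real prepared statement binds values.
expanded = [
    statement.replace("?", str(b)) + f" WHERE a = {a};"
    for a, b in rows
]
for query in expanded:
    print(query)
```

The decoded statement keeps the lowercase set keyword from the URL; CQL keywords are not case-sensitive, so it is equivalent to the SET form.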