A common use case example imports multiple tables, which are joined in SQL, to
Cassandra.
A common use case is to import multiple tables, which are joined in SQL, to
Cassandra. This example shows how to import two tables from MySQL. In MySQL, you use
query joins to get famous quotations from one table and the author of the quotation
from another. For
example:
mysql> SELECT * FROM person INNER JOIN mysql_quotations ON person.id=mysql_quotations.speaker;
+-----+--------------------+---------+----+---------+--------------------------------------------------------------+
| id | name | title | id | speaker | quote |
+-----+--------------------+---------+----+---------+--------------------------------------------------------------+
| 123 | Christopher Morley | Life | 1 | 123 | Life is a foreign language; all men mispronounce it. |
| 123 | Christopher Morley | Life | 2 | 123 | There are three ingredients in the good life: learning . . . |
| 124 | Daniel Akst | Life | 3 | 124 | In matters of self-control as we shall see again and . . . |
| 124 | Daniel Akst | Life | 4 | 124 | We Have Met the Enemy: Self-Control in an Age of Exc. . . |
| 125 | Abraham Lincoln | Success | 5 | 125 | Always bear in mind that your own resolution to . . . |
| 125 | Abraham Lincoln | Success | 6 | 125 | Better to remain silent and be thought a fool than . . . |
| 126 | Albert Einstein | Success | 7 | 126 | If A is success in life, then A equals x plus y plus . . . |
+-----+--------------------+---------+----+---------+--------------------------------------------------------------+
7 rows in set (0.00 sec)
This example assumes an analytics node is running.
Procedure
To import SQL tables into CQL using a collection set for the
quotations:
-
Download the import_quotations.zip file.
-
Create the mysql_quotations and person tables in MySQL. You can copy/paste
commands from the downloaded file to produce these tables.
-
Create the famous_words keyspace and quotations table in cqlsh. You can
copy/paste the commands from the downloaded file.
cqlsh> CREATE KEYSPACE famous_words WITH REPLICATION =
{'class':'NetworkTopologyStrategy', 'Analytics':1};
cqlsh> USE famous_words;
cqlsh:famous_words> CREATE TABLE quotations (
id text PRIMARY KEY,
name text,
title text,
quotes set text
);
-
Insert the data from the downloaded file into the person and mysql_quotations
tables.
-
Create an import options file named import_persons.options having the following
contents.
cql-import
--table
person
--cassandra-keyspace
famous_words
--cassandra-table
quotations
--cassandra-column-mapping
id:id,name:name,title:title
--connect
jdbc:mysql://127.0.0.1/famous_words
--username
root
--password
root
--cassandra-host
127.0.0.1
-
Import the person table into the CQL table. On Linux, for example:
$ sudo bin/dse sqoop --options-file path/import_person.options
The MapReduce job runs and at then end, looks something like
this:
. . .
14/06/16 20:26:43 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 35.1743 seconds (0 bytes/sec)
14/06/16 20:26:43 INFO mapreduce.ImportJobBase: Retrieved 4 records.
-
Check that the CQL quotations table now contains the data from the MySQL person
table.
cqlsh:famous_words> SELECT * FROM quotations;
id | name | quotes | title
-----+--------------------+--------+---------
123 | Christopher Morley | null | Life
125 | Abraham Lincoln | null | Success
126 | Albert Einstein | null | Success
124 | Daniel Akst | null | Life
(4 rows)
-
Create another import options file to import the mysql_quotations table. Use a
free form query to import the quotations into the CQL table. The literal string
$CONDITIONS needs to appear in the WHERE clause of the query. Sqoop replaces the
string with its refined constraints. For example:
cql-import
--query
select person.id, person.name, person.title, mysql_quotations.quote from person INNER JOIN mysql_quotations ON person.id=mysql_quotations.speaker WHERE $CONDITIONS
--target-dir
/sqoop
--split-by
person.id
--cassandra-keyspace
famous_words
--cassandra-table
quotations
--cassandra-column-mapping
id:id,name:name,title:title,quotes:quote
--connect
jdbc:mysql://127.0.0.1/famous_words
--username
root
--password
root
--cassandra-host
127.0.0.1
>
-
Import the resultset of quotations into the CQL table. On Linux, assuming you
named the options file import_quotes.options:
$ sudo bin/dse sqoop --options-file path/import_quotes.options
-
Check that the CQL quotations table now contains a collection set of quotations
as well as the name of the speaker and other information in the MySQL
table.
cqlsh:famous_words> SELECT * FROM quotations;