Inserting data into tables with static columns using Apache Spark™ SQL

Static columns are mapped to different columns in Spark SQL and require special handling. Because Spark SQL Thrift servers use Hive, when you run an insert query you must pass data to those columns.
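For context, a static column holds a single value that is shared by every row in a partition. The table definition is not shown in this example, but a hypothetical CQL schema for mytable, with a as the partition key and b static, might look like this:

CREATE TABLE mykeyspace.mytable (
  a int,             -- partition key
  c int,             -- clustering column (static columns require at least one)
  b int STATIC,      -- a single value of b is shared by all rows in a partition
  PRIMARY KEY (a, c)
);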

To work around the column mapping, set cql3.output.query in the Hive table properties of the destination table to limit the columns being inserted. In Spark SQL, alter the external table to set the prepared statement as the value of the Hive CQL output query. For example, the following prepared statement takes the values inserted into columns a and b of mytable and binds them to the placeholders for columns b and a, respectively, when writing the new row; the cql3.update.columns property lists the columns in the same bind order.

spark-sql> ALTER TABLE mytable SET TBLPROPERTIES ('cql3.output.query' = 'update mykeyspace.mytable set b = ? where a = ?');
spark-sql> ALTER TABLE mytable SET SERDEPROPERTIES ('cql3.update.columns' = 'b,a');
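
With these properties set, each row inserted through Spark SQL executes the configured prepared statement, binding values in the order given by cql3.update.columns. As a sketch, under the hypothetical schema above, inserting a row with a = 1 and b = 100 is equivalent to running this CQL update against the underlying table:

UPDATE mykeyspace.mytable SET b = 100 WHERE a = 1;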
