Running HiveQL queries using Spark SQL

Spark SQL supports queries written in HiveQL, a SQL-like language whose queries are converted to Spark jobs.

HiveQL is more mature and supports more complex queries than Spark SQL. To run a HiveQL query, first create a HiveContext instance, and then submit the query by calling the sql method on that instance.
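The two steps described above (create a HiveContext, then call sql) can be sketched for a standalone application as follows. This is a minimal sketch, assuming the Spark 1.x API in which HiveContext is constructed from a SparkContext. The object name HiveQLExample and the application name are hypothetical, and my_keyspace.my_table is the placeholder table used elsewhere on this page.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical standalone application; names are illustrative only.
object HiveQLExample {
  def main(args: Array[String]): Unit = {
    // Assumption: the master URL and other settings are supplied at submit time.
    val conf = new SparkConf().setAppName("HiveQLExample")
    val sc = new SparkContext(conf)

    // Step 1: create a HiveContext on top of the SparkContext.
    val hiveContext = new HiveContext(sc)

    // Step 2: submit a HiveQL query by calling the sql method.
    val results = hiveContext.sql("SELECT * FROM my_keyspace.my_table")

    // Bring the rows back to the driver and print them (suitable for small results only).
    results.collect().foreach(println)

    sc.stop()
  }
}
```

In the Spark shell this setup is not needed, because the shell already provides a HiveContext instance named sqlContext.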

See the Hive Language Manual for the full syntax of HiveQL.

Note: Creating indexes with DEFERRED REBUILD is not supported in Spark SQL.

Procedure

  1. Start the Spark shell.
    bin/dse spark
  2. Use the provided HiveContext instance, sqlContext, to run a HiveQL query by calling its sql method.
    val results = sqlContext.sql("SELECT * FROM my_keyspace.my_table")
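The query in step 2 only defines the result set. To look at the rows, you can call an action on it. The following is a sketch to run inside the Spark shell, assuming the predefined sqlContext and the placeholder table my_keyspace.my_table. The limit of 10 rows and the name rowCount are arbitrary choices for illustration.

```scala
// Run inside the Spark shell started with bin/dse spark,
// where sqlContext is already defined.
val results = sqlContext.sql("SELECT * FROM my_keyspace.my_table")

// Fetch at most 10 rows to the driver and print each one.
results.take(10).foreach(println)

// Count the rows returned by the query.
val rowCount = results.count()
println(s"Rows returned: $rowCount")
```

Using take rather than collect keeps the amount of data pulled into the shell small when the table is large.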