Supported syntax of Spark SQL

The following syntax defines a SELECT query.

SELECT [DISTINCT] [column names]|[wildcard] 
FROM [keyspace name.]table name 
[JOIN clause table name ON join condition]
[WHERE condition]
[GROUP BY column name]
[HAVING conditions]
[ORDER BY column names [ASC | DESC]]
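
As an illustration of the SELECT syntax above, the following sketch filters, groups, and sorts rows. The keyspace sales, the table orders, and its columns are hypothetical names chosen for the example; the trailing semicolons assume the statements are entered in a SQL shell.

```sql
-- Hypothetical keyspace "sales" and table "orders"; substitute your own names.
-- Distinct customer/status pairs for orders above a threshold, sorted ascending.
SELECT DISTINCT customer_id, status
FROM sales.orders
WHERE total > 100
ORDER BY customer_id ASC;

-- Customers with more than five orders, sorted descending by customer ID.
SELECT customer_id, COUNT(*) AS order_count
FROM sales.orders
GROUP BY customer_id
HAVING COUNT(*) > 5
ORDER BY customer_id DESC;
```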

A SELECT query using joins has the following syntax.

SELECT statement
FROM statement
[JOIN | INNER JOIN | LEFT JOIN | LEFT SEMI JOIN | LEFT OUTER JOIN | RIGHT JOIN | RIGHT OUTER JOIN | FULL JOIN | FULL OUTER JOIN]
ON join condition
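
A join following this pattern might look like the sketch below. The tables sales.orders and sales.customers and their columns are hypothetical; the aliases o and c are only for readability.

```sql
-- Hypothetical tables; every order is returned, with the customer name
-- filled in where a matching customer row exists (NULL otherwise).
SELECT o.order_id, c.name
FROM sales.orders o
LEFT OUTER JOIN sales.customers c
ON o.customer_id = c.customer_id;
```

Replacing LEFT OUTER JOIN with INNER JOIN would return only the orders that have a matching customer.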

Several SELECT statements can be combined in a UNION, INTERSECT, or EXCEPT query.

SELECT statement 1
[UNION | UNION ALL | UNION DISTINCT | INTERSECT | EXCEPT]
SELECT statement 2
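
For example, assuming two hypothetical tables sales.orders and sales.returns that both have a customer_id column, set operations could be written as follows. Both SELECT statements must produce the same number of columns with compatible types.

```sql
-- All customer IDs from both tables, duplicates kept.
SELECT customer_id FROM sales.orders
UNION ALL
SELECT customer_id FROM sales.returns;

-- Customer IDs that appear in both tables.
SELECT customer_id FROM sales.orders
INTERSECT
SELECT customer_id FROM sales.returns;

-- Customer IDs with orders but no returns.
SELECT customer_id FROM sales.orders
EXCEPT
SELECT customer_id FROM sales.returns;
```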

Note: SELECT queries run on newly added columns return '' (an empty result) instead of None.

The following syntax defines an INSERT query.

INSERT [OVERWRITE] INTO [keyspace name.]table name [(columns)]
VALUES values
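
A minimal sketch that follows the INSERT template above is shown here. The table sales.orders, its columns, and the literal values are hypothetical, and whether a given release accepts every form of the template should be checked against that release.

```sql
-- Hypothetical table and values; the column list matches the VALUES order.
INSERT INTO sales.orders (order_id, customer_id, total)
VALUES (1001, 42, 19.99);
```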

The following syntax defines a CACHE TABLE query.

CACHE TABLE table name [AS table alias]

You can remove a table from the cache using an UNCACHE TABLE query.

UNCACHE TABLE table name
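
A typical sequence, sketched with a hypothetical table named orders, is to cache the table, run the queries that benefit from it, and then release the memory.

```sql
-- Hypothetical table "orders"; cache it before repeated queries.
CACHE TABLE orders;

-- Queries against the cached table are served from memory.
SELECT COUNT(*) FROM orders;

-- Release the cached data when it is no longer needed.
UNCACHE TABLE orders;
```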

Keywords in Spark SQL

The following keywords are reserved in Spark SQL.

ALL
AND
AS
ASC
APPROXIMATE
AVG
BETWEEN
BY
CACHE
CAST
COUNT
DESC
DISTINCT
FALSE
FIRST
LAST
FROM
FULL
GROUP
HAVING
IF
IN
INNER
INSERT
INTO
IS
JOIN
LEFT
LIMIT
MAX
MIN
NOT
NULL
ON
OR
OVERWRITE
LIKE
RLIKE
UPPER
LOWER
REGEXP
ORDER
OUTER
RIGHT
SELECT
SEMI
STRING
SUM
TABLE
TIMESTAMP
TRUE
UNCACHE
UNION
WHERE
INTERSECT
EXCEPT
SUBSTR
SUBSTRING
SQRT
ABS