Using the Hive count function (deprecated)

For the cassandra.consistency.level general property in Hive TBLPROPERTIES, set the consistency level to ALL before you issue a Hive SELECT expression that contains the count function.

Hadoop is deprecated for use with DataStax Enterprise, including DSE Hadoop and BYOH (Bring Your Own Hadoop). Hive is also deprecated and will be removed when Hadoop is removed.

Using the cassandra.consistency.level property in Hive TBLPROPERTIES, set the consistency level to ALL before issuing a Hive SELECT expression that contains the count function. With ALL, the node that you query for a scan of all keys is fully consistent with the rest of the cluster. With any other consistency level, the result set can contain fewer rows than expected because replication has not finished propagating the rows to all nodes. Conversely, the count can be higher than expected because tombstones have not yet propagated to all nodes.
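
As a rough illustration, the sequence might look like the following in the Hive shell. This is a minimal sketch, not a definitive procedure: the table name mytable is hypothetical, and the sketch assumes that the property can be set on an existing table with the standard Hive ALTER TABLE ... SET TBLPROPERTIES statement rather than only when the table is created.

```sql
-- Hypothetical table name: mytable.
-- Assumption: the property is applied to an existing table through
-- the standard Hive ALTER TABLE ... SET TBLPROPERTIES statement.
ALTER TABLE mytable SET TBLPROPERTIES ('cassandra.consistency.level' = 'ALL');

-- Run the count only after the consistency level is set to ALL.
SELECT count(*) FROM mytable;
```

If the table is defined with a CREATE statement instead, the same key and value could presumably be listed in that statement's TBLPROPERTIES clause.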

To get accurate results from the count function using a consistency level other than ALL:
  • Repair all nodes.
  • Prevent data from being added or deleted while the count runs.