Using the Hive count function

Set the cassandra.consistency.level general property in Hive TBLPROPERTIES to ALL before you issue a Hive SELECT expression that contains the count function.

Using ALL ensures that when you query one node for a scan of all keys, that node is fully consistent with the rest of the cluster. With a consistency level other than ALL, the count can be off in either direction. The result set can have fewer rows than expected because replication has not finished propagating the rows to all nodes. The count can also be higher than expected because tombstones for deleted rows have not yet been propagated to all nodes.
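
As a minimal sketch of the setting described above, the following HiveQL sets the property on an existing table and then runs the count. The table name mytable is hypothetical, and the sketch assumes the table is already mapped to Cassandra and that the property can be changed with the standard Hive ALTER TABLE ... SET TBLPROPERTIES statement. If that does not apply in your environment, the same key and value can instead go in the TBLPROPERTIES clause when the table is created.

```sql
-- Hypothetical table name: replace mytable with your own Hive table.
-- Set the consistency level to ALL before counting.
ALTER TABLE mytable SET TBLPROPERTIES ('cassandra.consistency.level' = 'ALL');

-- Run the count at the consistency level set above.
SELECT count(*) FROM mytable;
```

Requests at ALL need every replica to respond, so the count may fail or slow down when a node is unavailable.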

To get accurate results from the count function using a consistency level other than ALL:
  • Repair all nodes.
  • Prevent data from being added or deleted while the count runs.