Running Hive

With DSE Hadoop, you can run Hive as a server or as a client. DataStax Enterprise supports Apache HiveServer and Apache HiveServer2. HiveServer is an optional service that lets remote clients submit programmatic requests to Hive. HiveServer2 is an improved version of HiveServer that supports multi-client concurrency, among other features, and works with the Beeline command shell.

Use a Hive client on a node in the cluster in either of these cases:
  • To connect to a Hive server running on another node
  • To use Hive in a single-node cluster

Start a Hive client

You can start a Hive client on any analytics node and run Hive queries, which execute as MapReduce jobs, directly on data already stored in Cassandra. The examples in this document run Hive as a client.

Procedure

  1. Start DataStax Enterprise as an analytics (Hadoop) node.
    • Installer-Services and Package installations:
      1. Enable Hadoop mode by setting this option in /etc/default/dse:
        HADOOP_ENABLED=1
      2. Use this command to start the service:
        $ sudo service dse start
    • Installer-No Services and Tarball installations:
      From the installation directory:
        $ bin/dse cassandra -t
  2. Start a Hive client.
    • Installer-Services and Package installations:
      $ dse hive
    • Installer-No Services and Tarball installations:
      $ install_location/bin/dse hive
    The hive prompt appears, and you can now enter HiveQL shell commands.
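
The procedure above differs only by installation type, so it can be summarized in a small helper. The following is a minimal sketch, not part of DataStax Enterprise: the function name print_hive_steps and its arguments are hypothetical, and the function only prints the commands from the procedure instead of running them, so it is safe to try on a machine without DSE installed.

```shell
#!/bin/sh
# Hypothetical helper: print the commands from the procedure above for a
# given installation type. Nothing is executed; the commands are only echoed.
print_hive_steps() {
    install_type="$1"                          # "package" or "tarball"
    install_location="${2:-install_location}"  # placeholder path, as in the procedure

    case "$install_type" in
        package)
            # Installer-Services and Package installations
            echo "# set HADOOP_ENABLED=1 in /etc/default/dse"
            echo "sudo service dse start"
            echo "dse hive"
            ;;
        tarball)
            # Installer-No Services and Tarball installations
            echo "cd $install_location"
            echo "bin/dse cassandra -t"
            echo "$install_location/bin/dse hive"
            ;;
        *)
            echo "unknown install type: $install_type" >&2
            return 1
            ;;
    esac
}
```

For example, print_hive_steps tarball /opt/dse lists the three commands for a tarball installation under /opt/dse, where /opt/dse is an assumed location used only for illustration.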