Quick Start for Bare Metal/VM installs

This document explains installation of Luna Streaming for Bare Metal/VM deployments with a Pulsar tarball.

The resulting Luna Streaming deployment includes:

  • Tiered Storage: Offload historical messages to more cost-effective object storage such as AWS S3, Azure Blob, Google Cloud Storage, and HDFS.

  • Built-in Schema Registry: Guarantee messaging type safety on a per-topic basis without relying on any external facility.

  • Pulsar I/O connectors: Enables Pulsar to exchange data with external systems, either as sources or sinks.

  • Pulsar Functions: Lightweight compute extensions of Pulsar brokers that enable simple real-time event processing within Pulsar.

  • Pulsar SQL: SQL-based interactive query for message data stored in Pulsar.

  • Pulsar Transactions: Enables event streaming applications to consume, process, and produce messages in one atomic operation.

Requirements

  • A Linux server or VM

  • JDK 11
    Pulsar can run with JDK 8, but DataStax Luna Streaming is designed for Java 11. Java 17 LTS will be supported in the future.

  • File System
    DataStax recommends XFS, but ext4 will work.

  • For a single-node install, a server with at least 8 CPUs and 32 GB of memory is required.

  • For a small high-availability deployment, 4 servers are required. The servers must be on the same network so they can communicate with each other.

  • Servers should have at least 50 GB in their root disk volume.

  • BookKeeper should use one volume device for the journal and one volume device for the ledgers. The journal device should be 20 GB. The ledger volume device should be sized to hold the expected amount of stored message data.

  • DataStax recommends a separate data disk volume for ZooKeeper.

  • Operating System Settings
    Disable swap and set Linux Transparent Huge Pages (THP) to madvise. Check the THP settings with cat /sys/kernel/mm/transparent_hugepage/enabled and cat /sys/kernel/mm/transparent_hugepage/defrag.
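
The requirements above can be checked before installing. The following is a minimal pre-flight sketch, not an official DataStax tool. It assumes a Linux host with standard utilities (sed, awk, nproc) and the usual /proc and /sys paths; the thp_mode helper is a hypothetical name introduced here for illustration.

```shell
#!/bin/sh
# Pre-flight sketch for the requirements listed above (Linux only).

# Print the active value from a THP sysfs file. The kernel marks the
# active value with square brackets, e.g. "always [madvise] never".
thp_mode() {
  sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Java: the version banner is written to stderr, so redirect it to stdout.
if command -v java >/dev/null 2>&1; then
  java -version 2>&1 | head -n 1
else
  echo "java: not found on PATH"
fi

# CPU count and total memory (MemTotal is reported in KiB).
if command -v nproc >/dev/null 2>&1; then
  echo "CPUs: $(nproc)"
fi
if [ -r /proc/meminfo ]; then
  awk '/^MemTotal:/ { printf "Memory: %.1f GiB\n", $2 / 1024 / 1024 }' /proc/meminfo
fi

# Transparent Huge Pages: both files should report "madvise".
for f in /sys/kernel/mm/transparent_hugepage/enabled \
         /sys/kernel/mm/transparent_hugepage/defrag; do
  if [ -r "$f" ]; then
    echo "$f: $(thp_mode "$f")"
  else
    echo "$f: not readable on this system"
  fi
done

# Swap: /proc/swaps has one header line, then one line per active device.
if [ -r /proc/swaps ]; then
  echo "Active swap devices: $(tail -n +2 /proc/swaps | wc -l)"
fi
```

If the THP files report a different value, `echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/enabled` changes it until the next reboot, and `sudo swapoff -a` turns off active swap. Making either change persistent depends on the distribution.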

Installation

  1. Download the DataStax Luna Streaming tarball from the DataStax GitHub repo. There are three versions of Luna Streaming currently available:

    Luna Streaming filename                   Included components
    lunastreaming-core-<version>-bin.tar.gz   Contains the core Pulsar modules: ZooKeeper, Broker, BookKeeper, and function worker
    lunastreaming-<version>-bin.tar.gz        Contains all components from lunastreaming-core as well as support for Pulsar SQL
    lunastreaming-all-<version>-bin.tar.gz    Contains all components from lunastreaming as well as the NAR files for all Pulsar I/O connectors and offloaders

  2. Untar the tarball and change into the resulting directory.

    tar xzvf lunastreaming-2.10.0-bin.tar.gz
    cd lunastreaming-2.10.0
  3. Run ls -al to view your Luna Streaming files:

    ➜  lunastreaming-2.10.0.3 ls -al
    total 88
    drwxr-xr-x@  11 firstname.lastname  staff    352 May 17 05:58 .
    drwx------+  98 firstname.lastname  staff   3136 May 24 14:15 ..
    -rw-r--r--@   1 firstname.lastname  staff  31209 Jan 22  2020 LICENSE
    -rw-r--r--@   1 firstname.lastname  staff   6612 Jan 22  2020 NOTICE
    -rw-r--r--@   1 firstname.lastname  staff   1269 Jan 22  2020 README
    drwxr-xr-x@  12 firstname.lastname  staff    384 Jan 22  2020 bin
    drwxr-xr-x@  21 firstname.lastname  staff    672 Jan 22  2020 conf
    drwxr-xr-x@   6 firstname.lastname  staff    192 May 17 05:58 examples
    drwxr-xr-x@   5 firstname.lastname  staff    160 May 17 05:58 instances
    drwxr-xr-x@ 277 firstname.lastname  staff   8864 May 17 05:58 lib
    drwxr-xr-x@  25 firstname.lastname  staff    800 Jan 22  2020 licenses

You have successfully installed the DataStax Luna Streaming tarball.
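
As a quick sanity check after extraction, the directory can be compared against the listing above. This is a minimal sketch: check_layout is a hypothetical helper name, and checking only bin, conf, and lib is an assumption about which directories matter most, not a documented requirement.

```shell
# Succeed only if the given directory contains the bin, conf, and lib
# subdirectories that appear in the listing above.
check_layout() {
  for d in bin conf lib; do
    if [ ! -d "$1/$d" ]; then
      echo "missing: $1/$d"
      return 1
    fi
  done
  echo "layout OK: $1"
}

# Example call, run from the parent of the extracted directory:
#   check_layout lunastreaming-2.10.0
```

A non-zero exit status usually means the tarball was only partially extracted or the command was run from the wrong directory.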

Additional tooling

Once the DataStax Luna Streaming tarball is installed, you may want to add additional tooling to your server/VM deployment.

  • Pulsar Admin Console: Web-based UI for administering Pulsar.
    Download the latest version from the DataStax GitHub repo and follow the instructions here.

    Admin Console requires NodeJS 14 LTS and Nginx version 1.17.9+.

  • Pulsar Heartbeat: Monitors Pulsar cluster availability.
    Download the latest version from the DataStax GitHub repo and follow the instructions here.
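
Before installing the Admin Console, the Node.js and Nginx versions noted above can be checked from the shell. The sketch below assumes node and nginx are on the PATH and that sort supports -V (version sort, available in GNU coreutils); version_ge is a hypothetical helper introduced for illustration.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# Relies on "sort -V" (version sort), which GNU coreutils provides.
version_ge() {
  [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Node.js prints its version as "v<major>.<minor>.<patch>" on stdout.
if command -v node >/dev/null 2>&1; then
  echo "node: $(node --version)"
else
  echo "node: not found on PATH"
fi

# nginx prints "nginx version: nginx/<version>" on stderr.
if command -v nginx >/dev/null 2>&1; then
  nginx_version=$(nginx -v 2>&1 | sed -n 's|.*nginx/\([0-9.]*\).*|\1|p')
  if version_ge "$nginx_version" 1.17.9; then
    echo "nginx $nginx_version: OK"
  else
    echo "nginx $nginx_version: older than 1.17.9"
  fi
else
  echo "nginx: not found on PATH"
fi
```

Version sort is used instead of a plain string comparison so that 1.17.10 is correctly treated as newer than 1.17.9.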

What’s next?