Quick Start for Bare Metal/VM installs
This document explains installation of Luna Streaming for Bare Metal/VM deployments with a Pulsar tarball.
The resulting Luna Streaming deployment includes:
-
Tiered Storage: Offload historical messages to more cost-effective object storage such as AWS S3, Azure Blob, Google Cloud Storage, and HDFS.
-
Built-in Schema Registry: Guarantee messaging type safety on a per-topic basis without relying on any external facility.
-
Pulsar I/O connectors: Enable Pulsar to exchange data with external systems, either as sources or sinks.
-
Pulsar Functions: Lightweight compute extensions of Pulsar brokers that enable simple real-time event processing within Pulsar.
-
Pulsar SQL: SQL-based interactive query for message data stored in Pulsar.
-
Pulsar Transactions: Enable event streaming applications to consume, process, and produce messages in one atomic operation.
Requirements
-
A Linux server or VM
-
JDK 17
-
File system: DataStax recommends XFS, but ext4 will work.
-
For a single node install, a server with at least 8 CPUs and 32 GB of memory is required.
-
For a small high-availability deployment, 4 servers are required. The servers must be on the same network so they can communicate with each other.
-
Servers should have at least 50 GB in their root disk volume.
-
BookKeeper should use one volume device for the journal and one volume device for the ledgers. The journal device should be 20 GB. The ledger volume device should be sized to hold the expected amount of stored message data.
-
DataStax recommends a separate data disk volume for ZooKeeper.
-
Operating system settings: Disable swap and set Linux Transparent Huge Pages (THP) to madvise. Check this setting with cat /sys/kernel/mm/transparent_hugepage/enabled and cat /sys/kernel/mm/transparent_hugepage/defrag.
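The swap and THP checks above can be scripted. The following is a minimal sketch, not an official DataStax tool: the function names are hypothetical, and it assumes a Linux host that exposes /proc/swaps and the THP sysfs files. The kernel marks the active THP mode with square brackets, for example "always [madvise] never".

```shell
#!/bin/sh
# Hypothetical helper names; a sketch that assumes a Linux host.

# Read a THP sysfs line such as "always [madvise] never" on stdin and
# print the value the kernel has wrapped in square brackets.
thp_active_mode() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Read /proc/swaps-style input on stdin. The first line is a header,
# so any further line means a swap device is active.
swap_is_off() {
    awk 'NR > 1 { found = 1 } END { exit (found ? 1 : 0) }'
}

# Print the current THP and swap state of this host.
report_os_settings() {
    for f in enabled defrag; do
        printf 'THP %s: %s\n' "$f" \
            "$(thp_active_mode < "/sys/kernel/mm/transparent_hugepage/$f")"
    done
    if swap_is_off < /proc/swaps; then
        echo "swap: off"
    else
        echo "swap: on"
    fi
}
```

The functions only read state. Run report_os_settings on the target server; a host configured as described above would be expected to show madvise for both THP files and no active swap.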
Installation
-
Download the DataStax Luna Streaming tarball from the DataStax GitHub repo. There are three versions of Luna Streaming currently available:
Luna Streaming filename | Included components
lunastreaming-core-<version>-bin.tar.gz | Contains the core Pulsar modules: ZooKeeper, Broker, BookKeeper, and function worker
lunastreaming-<version>-bin.tar.gz | Contains all components from lunastreaming-core as well as support for Pulsar SQL
lunastreaming-all-<version>-bin.tar.gz | Contains all components from lunastreaming as well as the NAR files for all Pulsar I/O connectors and offloaders
-
Untar the tarball and change directory into the resulting directory.
tar xzvf lunastreaming-3.1.3.0-bin.tar.gz
cd lunastreaming-3.1.3.0
-
Enter ls -al to view your Luna Streaming files:
➜ lunastreaming-3.1.3.0 ls -al
total 88
drwxr-xr-x@ 11 firstname.lastname staff 352 May 17 05:58 .
drwx------+ 98 firstname.lastname staff 3136 May 24 14:15 ..
-rw-r--r--@ 1 firstname.lastname staff 31209 Jan 22 2020 LICENSE
-rw-r--r--@ 1 firstname.lastname staff 6612 Jan 22 2020 NOTICE
-rw-r--r--@ 1 firstname.lastname staff 1269 Jan 22 2020 README
drwxr-xr-x@ 12 firstname.lastname staff 384 Jan 22 2020 bin
drwxr-xr-x@ 21 firstname.lastname staff 672 Jan 22 2020 conf
drwxr-xr-x@ 6 firstname.lastname staff 192 May 17 05:58 examples
drwxr-xr-x@ 5 firstname.lastname staff 160 May 17 05:58 instances
drwxr-xr-x@ 277 firstname.lastname staff 8864 May 17 05:58 lib
drwxr-xr-x@ 25 firstname.lastname staff 800 Jan 22 2020 licenses
You have successfully installed the DataStax Luna Streaming tarball.
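The extraction steps above can be checked with a short script. This is a sketch under stated assumptions: the function names are hypothetical, and the bin, conf, and lib directories it looks for are taken from the example listing above, not from an official layout specification.

```shell
#!/bin/sh
# Hypothetical helpers; a sketch, not part of the Luna Streaming distribution.

# Print the top-level directory stored in a gzipped tarball, so a script
# does not have to guess the name of the extracted directory.
tarball_top_dir() {
    tar -tzf "$1" | head -n 1 | cut -d/ -f1
}

# Succeed only if the directory contains the bin, conf, and lib
# subdirectories that appear in the example listing.
has_expected_layout() {
    for d in bin conf lib; do
        [ -d "$1/$d" ] || return 1
    done
}

# Example usage, using the tarball name from the steps above:
#   tar xzf lunastreaming-3.1.3.0-bin.tar.gz
#   dir=$(tarball_top_dir lunastreaming-3.1.3.0-bin.tar.gz)
#   has_expected_layout "$dir" && echo "layout looks complete"
```

Reading the directory name from the archive itself avoids hard-coding it, which matters if the core or all tarballs unpack to a differently named directory.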
Additional tooling
Once the DataStax Luna Streaming tarball is installed, you may want to add additional tooling to your server/VM deployment.
-
Pulsar Admin Console: Web-based UI for administering Pulsar.
Download the latest version from the DataStax GitHub repo and follow the instructions here. Admin Console requires NodeJS 14 LTS and Nginx version 1.17.9+.
-
Pulsar Heartbeat: Monitors Pulsar cluster availability.
Download the latest version from the DataStax GitHub repo and follow the instructions here.
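The Admin Console version requirements above can be checked before installation. The following is a minimal sketch with hypothetical function names. It assumes GNU sort (for the -V version-sort option) and that nginx and node are on the PATH when the commented usage lines are run.

```shell
#!/bin/sh
# Hypothetical helpers; a sketch that assumes GNU sort for the -V option.

# Succeed if version $1 is greater than or equal to version $2.
# sort -V orders version strings numerically per component, so the
# smaller of the two is printed first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the major version from a string such as "v14.21.3".
node_major() {
    echo "$1" | sed 's/^v//' | cut -d. -f1
}

# Example usage (nginx -v writes "nginx version: nginx/<version>" to stderr):
#   nginx_version=$(nginx -v 2>&1 | sed 's|.*nginx/\([0-9.]*\).*|\1|')
#   version_ge "$nginx_version" 1.17.9 || echo "Nginx is older than 1.17.9"
#   [ "$(node_major "$(node --version)")" = 14 ] || echo "NodeJS 14 not found"
```

A plain string comparison would rank 1.9.0 above 1.17.9, which is why the sketch relies on version sort.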
What’s next?
-
For initializing Pulsar components like BookKeeper and ZooKeeper, see the Pulsar documentation.
-
For installing optional built-in connectors or tiered storage included in lunastreaming-all, see the Pulsar documentation.
-
For installation to existing Kubernetes environments or with a cloud provider, see Quick Start for Helm Chart installs.
-
For Ansible deployment, use the DataStax Ansible scripts provided at https://github.com/datastax/pulsar-ansible.