Luna Streaming FAQs

If you are new to DataStax Luna Streaming and its Apache Pulsar enhancements, these FAQs are for you.

Introduction

What is DataStax Luna Streaming?

DataStax Luna Streaming is a new Kubernetes-based distribution of Apache Pulsar, based on the technology that Kesque built to run its Pulsar-as-a-service.

What components and features are provided by DataStax Luna Streaming?

In addition to Apache Pulsar itself, DataStax Luna Streaming provides:

  • An installer that can stand up a dev or production cluster on bare metal or VMs without a pre-existing Kubernetes environment

  • A Helm Chart that can deploy and manage Pulsar on your current Kubernetes infrastructure

  • Cassandra, Elastic, Kinesis, Kafka, and JDBC connectors

  • A management dashboard

  • A monitoring and alerting system

On which version of Apache Pulsar is DataStax Luna Streaming based?

DataStax Luna Streaming is based on Apache Pulsar 2.6.2, and features additional enhancements from DataStax contributors.

What does DataStax Luna Streaming provide that I cannot get with open-source Apache Pulsar?

DataStax Luna Streaming is a hardened version of Apache Pulsar that has been run through additional testing to ensure it is ready for production use. It also includes additional tooling to help monitor your system, including an enhanced Admin Console and a Heartbeat service that monitors system health.

Is DataStax Luna Streaming an open-source project?

Yes, DataStax Luna Streaming is open source. See the repos FAQ.

Which Kubernetes platforms are supported by DataStax Luna Streaming?

Supported platforms include Minikube, K3d, Kind, Google Kubernetes Engine (GKE), Microsoft Azure Kubernetes Service (AKS), Amazon Elastic Kubernetes Service (EKS), and other commonly used platforms.

Where are the DataStax Luna Streaming public GitHub repos?

There are several public repos, each with a different purpose. See:

Installation

Is there a prerequisite version of Java needed for the DataStax Luna Streaming installation?

The DataStax Luna Streaming distribution is designed for Java 11. However, because the product releases Docker images, you do not need to install Java (8 or 11) in advance. Java 11 is bundled in the Docker image.

What are the install options for DataStax Luna Streaming?

  • Use the provided Helm Chart to install DataStax Luna Streaming in an existing Kubernetes cluster on your laptop or hosted by a cloud provider.

  • Use the provided replicated.io packaging to install on one or more servers or VMs.

How do I install DataStax Luna Streaming in my Kubernetes cluster?

DataStax Luna Streaming provides a Helm Chart (version 3). Use it to add the datastax-pulsar repo and install DataStax Luna Streaming on a Kubernetes cluster that runs on your laptop or is hosted by a cloud provider. Example:

helm repo add datastax-pulsar https://datastax.github.io/pulsar-helm-chart
curl -LOs https://datastax.github.io/pulsar-helm-chart/examples/dev-values.yaml
helm install pulsar -f dev-values.yaml datastax-pulsar/pulsar

You can set additional options in the values file, but the commands above represent the quick start approach. For more, see Quick Start for Helm Chart installs.
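
Before running the quick start commands, it can help to confirm that the tools they invoke (helm and curl) are installed. The following is a minimal sketch and is not part of the Luna Streaming tooling; the function name require_tools is hypothetical:

```shell
# Hypothetical helper (not shipped with Luna Streaming): succeeds only if
# every command named in its arguments is found on the PATH.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      # Report each missing tool on stderr and remember the failure.
      echo "missing required tool: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_tools helm curl && echo "ready for the quick start"` prints the message only when both tools are available.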

How do I install DataStax Luna Streaming on my server or VM?

DataStax Luna Streaming provides a pre-configured Replicated kURL package to install Luna Streaming on a single server/VM or on multiple (4) servers/VMs. Example initial setup commands:

curl -sSL https://k8s.kurl.sh/luna-streaming | sudo bash
curl -s https://downloads.datastax.com/license/luna-streaming/Community.yaml -o Community.yaml
bash -l
kubectl kots install luna-streaming --namespace default --license-file ./Community.yaml

Then follow the full instructions in Quick Start for Server/VM installs.

Tools and UIs

What is Replicated?

Replicated provides a container-based platform to deploy cloud-native applications inside your on-prem environment, which gives you greater security and control. DataStax Luna Streaming uses Replicated to install on a large server/VM (minimum 8 CPUs, 32 GB RAM), or on a small high-availability (HA) cluster of 4 nodes, each with the same minimum of 8 CPUs and 32 GB RAM. By design, the details of the Replicated packaging and its configuration are abstracted away by DataStax Luna Streaming, so that you can get started quickly and focus on building pub/sub apps with this enhanced distribution of Pulsar.
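
The minimums stated above (8 CPUs and 32 GB RAM per node) can be expressed as a small pre-install check. This is an illustrative sketch only; the function and variable names are assumptions, and the kURL documentation remains the authoritative source for requirements:

```shell
# Minimums taken from the answer above; names are hypothetical.
MIN_CPUS=8
MIN_MEM_GB=32

# meets_minimum <cpu_count> <memory_in_gb>
# Succeeds only when both values reach the stated minimums.
meets_minimum() {
  [ "$1" -ge "$MIN_CPUS" ] && [ "$2" -ge "$MIN_MEM_GB" ]
}
```

For example, `meets_minimum 8 32` succeeds, while `meets_minimum 4 32` fails. On Linux, one way to obtain the inputs is `nproc` for the CPU count and the MemTotal line of /proc/meminfo for memory. The kernel typically reports slightly less than the nominal RAM, so round up before comparing.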

What are the system requirements for the server/VM installs that use Replicated KOTS?

See the system requirements in the kURL documentation, which is used with Replicated Kubernetes Off The Shelf (KOTS).

How do I start the DataStax Luna Streaming Admin Console?

See the following topics:

  • If you installed Luna Streaming via the Helm Chart for a Kubernetes environment residing on your laptop or with a cloud provider, see Quick Start for Helm Chart installs.

  • If you installed Luna Streaming via the Replicated package on a large server/VM or in a small HA (4 node cluster), see Quick Start for Server/VM installs. First set up the environment via the Installation Console; its URL and password for your environment were displayed in the output from Step 1, as covered in that topic. Then, when the application has been deployed and is running, open the node’s host DNS name in a browser. To log in to the Luna Streaming Admin Console, use the default username admin and the generated password, which you can find on the Installation Console’s Config tab.

What tasks can I perform in the DataStax Luna Streaming Installation Console?

From the Installation Console, you can:

  • Configure the DataStax Luna Streaming environment

  • View and update the DataStax Luna Streaming version and application status

  • Get the host URL and generated password to launch the DataStax Luna Streaming Admin Console

  • Set up and view Grafana graphs, which display metrics collected by the provided Prometheus Operator

  • Analyze DataStax Luna Streaming to collect information that you can provide to Support

  • Sync the license that enables DataStax Luna Streaming software

  • View upstream, midstream, and downstream files used by DataStax Luna Streaming

What tasks can I perform in the DataStax Luna Streaming Admin Console?

From the Admin Console, you can:

  • Add and run Pulsar clients

  • Establish credentials for secure connections

  • Define topics that can be published for streaming apps

  • Set up Pulsar sinks that publish topics and make them available to subscribers, such as for a Cassandra database table

  • Control namespaces used by Pulsar

  • Use the Admin API

What is Pulsar Heartbeat?

Pulsar Heartbeat monitors the availability, tracks the performance, and reports failures of the Pulsar cluster. It produces synthetic workloads to measure end-to-end pub/sub message latency. Pulsar Heartbeat is a cloud-native application that can be installed by Helm within the Pulsar Kubernetes cluster.

What is Prometheus?

Prometheus is an open-source tool to collect metrics on a running app, providing real-time monitoring and alerts.

What is Grafana?

Grafana is a visualization tool that helps you make sense of metrics and related data coming from your apps via Prometheus, for example.

Pulsar Connector

What are the features provided by DataStax Apache Pulsar Connector (pulsar-sink) that are not supported in kafka-sink?

  • Single record acknowledgement and negative acknowledgements.

  • The Pulsar IO framework provides many features that are not possible in Kafka, and has different compression formats and auth/security features. These features are handled by Pulsar itself.

What features are missing in DataStax Apache Pulsar Connector (pulsar-sink) compared with kafka-sink?

  • No support for tinyint (8-bit integer) and smallint (16-bit integer).

  • The key is always a String, but you can write JSON inside it; the support is implemented in pulsar-sink, but not in Pulsar IO.

  • The “value” of a “message property” is always a String; for example, you cannot map the message property to ttl or to timestamp.

  • Field names inside structures must be valid for Avro, even in the case of JSON structures. For example, field names like Int.field (with a dot) or int field (with a space) are not valid.
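
The Avro specification requires a name to start with a letter or underscore and to contain only letters, digits, and underscores after that. A minimal sketch of that rule as a shell check follows; the function name is hypothetical, and this is not how pulsar-sink itself validates fields:

```shell
# Hypothetical helper: succeeds if the argument is a valid Avro name,
# that is, it matches [A-Za-z_][A-Za-z0-9_]*.
# The body runs in a subshell so the locale change stays local.
is_valid_avro_name() (
  LC_ALL=C   # keep the A-Z and a-z ranges limited to ASCII
  case $1 in
    # Reject: empty string, bad first character, or any bad character.
    '' | [!A-Za-z_]* | *[!A-Za-z0-9_]*) exit 1 ;;
    *) exit 0 ;;
  esac
)
```

For example, `is_valid_avro_name 'Int.field' || echo "rename this field"` prints the message, because the dot is not allowed in a field name.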

How is DataStax Apache Pulsar Connector distributed?

There are two packages:

  • The pulsar-sink functionality of DataStax Apache Pulsar Connector is included with DataStax Luna Streaming. It’s built in!

  • You can optionally download the DataStax Apache Pulsar Connector tarball from the DataStax Downloads site, and then use it as its own product with your open-source Apache Pulsar install.

If you’re using open-source software (OSS) Apache Pulsar, you can use DataStax Apache Pulsar Connector with it to take advantage of this pulsar-sink for Cassandra. See the DataStax Apache Pulsar Connector documentation.

APIs

What client APIs does DataStax Luna Streaming provide?

The same as for Apache Pulsar. See https://pulsar.apache.org/docs/en/client-libraries/.

Next

Learn how to install DataStax Luna Streaming via the Helm Chart or via the Replicated package.