Consuming change data with Apache Pulsar™

This article is a continuation of the Change Data Capture (CDC) pattern with Apache Cassandra® and Apache Pulsar™ article. Please read that article first to understand the fundamentals of what resources are being used.

Pulsar clients

Each client handles message consumption a little differently but there is one overall pattern to follow. As we learned in the previous sections, a CDC message will arrive as an Avro GenericRecord of type KeyValue. Typically, the first step will be to separate the key and value portions of the message. You will find the Cassandra table’s key fields in the key of the record and the change data in the value portion of the record. Both of which are Avro records themselves. From there you’ll want to deserialize the Avro record and extract the interesting info.

Below are example implementations for each runtime consuming messages from the CDC data topic.

While these examples are in the “astra-streaming-examples” repository, they are not Astra Streaming-specific. You can use these examples to consume CDC data topics in your own Cassandra/Pulsar clusters.

Pulsar functions

It is very common to have a function consuming the CDC data. Functions usually perform additional processing on the data and pass it to another topic. Similar to a client consumer, it will need to deserialize the message data. Below are examples of different functions consuming messages from the CDC data topic.

Golang

Java

Python

You’re ready to tackle CDC like a pro! Use our CDC questions and patterns with Apache Cassandra® and Apache Pulsar™ as reference as you near production.

Consuming change data with Apache Pulsar™

Pulsar clients

Pulsar functions

Next

Was this helpful?

Give Feedback