Features

Key features in DataStax Apache Kafka Connector.

DataStax Apache Kafka™ Connector is the bridge that allows data to move from Apache Kafka to DataStax-supported database tables in event-driven architectures. Known in the Kafka Connect framework as a sink, the connector has four key features: performance, flexibility, security, and visibility.

Flexibility

The design of this sink accounts for the varying data structures found in Apache Kafka. Selective mapping in the DataStax Apache Kafka Connector lets the user specify which Kafka fields are written to which database columns. Because of this flexibility, a single connector instance can read from multiple Apache Kafka topics and write to many database tables, which removes the burden of managing several connector instances. Whether the Apache Kafka data is in Avro, JSON, or string format, the DataStax Apache Kafka Connector provides advanced parsing to handle the wide range of data inputs.
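To make the idea of selective mapping concrete, here is a minimal conceptual sketch in Python. It is not the connector's implementation or its configuration syntax: the `column=source` mapping string, the helper names (`parse_mapping`, `resolve`, `apply_mapping`), and the sample record are all assumptions made for illustration. It only shows how a mapping can select some Kafka fields and ignore the rest.

```python
from typing import Any, Dict


def parse_mapping(spec: str) -> Dict[str, str]:
    """Parse a hypothetical 'col=source, col=source' string into a dict."""
    mapping: Dict[str, str] = {}
    for pair in spec.split(","):
        column, _, source = pair.partition("=")
        mapping[column.strip()] = source.strip()
    return mapping


def resolve(record: Dict[str, Any], path: str) -> Any:
    """Follow a dotted path such as 'value.price' into a nested record."""
    current: Any = record
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None  # a missing field resolves to None in this sketch
        current = current[part]
    return current


def apply_mapping(record: Dict[str, Any], spec: str) -> Dict[str, Any]:
    """Build a database row that contains only the mapped columns."""
    return {col: resolve(record, src) for col, src in parse_mapping(spec).items()}


# Hypothetical Kafka record with a key and a structured value.
record = {
    "key": "AAPL",
    "value": {"symbol": "AAPL", "price": 208.5, "exchange": "NASDAQ"},
}

# Only two of the available fields are written; 'exchange' is ignored.
row = apply_mapping(record, "id=key, last_price=value.price")
print(row)  # {'id': 'AAPL', 'last_price': 208.5}
```

Applying a second mapping string to the same record would produce a row for a second table, which is the shape of the single-topic-to-many-tables pattern described above.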

Security

With built-in SSL, LDAP/Active Directory, and Kerberos integration, DataStax Apache Kafka Connector secures the connection from client to server and can help meet strict compliance requirements.

Visibility

Complex distributed environments are bound to hit points of failure. DataStax Apache Kafka Connector and its integration with the DataStax Drivers handle error conditions. Included metrics give the operator visibility into failure rates and latency as messages pass from Kafka to DataStax databases.

Highlights of the key features

| Feature | DataStax Apache Kafka Connector summary |
| --- | --- |
| Consume Kafka primitive data format | Accepts Kafka record data that is in primitive type form. |
| Consume Kafka JSON data format | Accepts Kafka record data that is in valid JSON form. |
| Consume Kafka Avro data format | Accepts Kafka record data that is in valid Avro form. |
| Pluggable Connect converters | Works with StringConverter, JsonConverter, AvroConverter, ByteArrayConverter, and Numeric Converters, as well as custom data converters. Note that the producer of the data must use the same converter as the connector. |
| Provides JMX metrics | Exposes JMX metrics for record counts, failure counts, and latency. |
| Runs within Connect Worker | Deployed in the Kafka Connect framework. |
| At-least-once guarantee | Stores the Kafka offset value during streaming from the Kafka topic to the mapped database table. If any component is restarted, DataStax Apache Kafka Connector resumes the record evaluation where it left off. This feature ensures that no records are missed. |
| Standalone mode support | Deployed in the Kafka Connect framework and works in standalone mode (meant for dev/test). |
| Distributed mode / HA support | Deployed in the Kafka Connect framework and works in distributed mode (meant for production). |
| Flexible Kafka topic to database table mapping | Extends flexible mapping functionality to control the specific fields that are pulled from Kafka and written to the supported database types. |
| Single Kafka topic to multiple database tables | Enables a common denormalization pattern by allowing a single topic to be written to many database tables. The multiple databases must be of the same type. |
| Connector throttling and parallelism | Built-in throttling limits the maximum number of concurrent requests that a single connector instance can send. Parallelism is delivered through the integration with the Kafka Connect distributed framework and asynchronous connector internals. |
| Flexible date/time/timestamp formats | Accounts for the typical case where separate teams write to the same Kafka deployment and may use varying formats for date/time fields. |
| Configurable consistency level | Allows configuring the consistency level on a per topic-table basis. |
| Row-level Time-to-Live (TTL) | Allows configuring row-level TTL on a per topic-table basis. |
| Deletes | Allows configuring deletes on a per topic-table basis. |
| Handling of nulls | Allows configuring null handling on a per topic-table basis. |
| Error handling | Built-in error handling for various failure scenarios, including bad mappings and database write issues. |
| Offset management | Leverages the Kafka Connect framework to manage offsets by storing the offset in Kafka. |
| Connector to database SSL | Allows configuring the connection to the database with SSL. |
| Connector to database username/password | Allows configuring the connection to the database with username/password. |
| Connector to database LDAP/Active Directory | Allows configuring the connection to the database with LDAP/Active Directory. |
| Connector to DSE Kerberos | Allows configuring the connection to the database with Kerberos. |
| Configurable write timeout | Allows configuring a write timeout to the database. |
| Connector to database compression | Allows configuring the connection to the database with compression strategies. |
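
The at-least-once guarantee listed above can be illustrated with a small simulation. This is a conceptual sketch only, not the connector's code: the `OffsetStore` class, the `run_sink` function, and the simulated crash are hypothetical. It assumes the general pattern of committing an offset only after the corresponding write has succeeded, so that a restart resumes from the last committed offset.

```python
from typing import Callable, Dict, List, Tuple

Record = Tuple[int, str]  # (offset, payload)


class OffsetStore:
    """Hypothetical stand-in for the offset that is stored in Kafka."""

    def __init__(self) -> None:
        self.committed = 0  # next offset that still needs to be processed


def run_sink(topic: List[Record], store: OffsetStore,
             write: Callable[[str], None]) -> None:
    """Write each uncommitted record, then advance the committed offset."""
    for offset, payload in topic:
        if offset < store.committed:
            continue  # already written and committed before the restart
        write(payload)                 # write to the database first
        store.committed = offset + 1   # commit only after the write succeeds


topic: List[Record] = [(0, "a"), (1, "b"), (2, "c")]
store = OffsetStore()
table: List[str] = []
calls: Dict[str, int] = {"n": 0}


def flaky_write(payload: str) -> None:
    """Append a row, then simulate a crash on the second call."""
    calls["n"] += 1
    table.append(payload)
    if calls["n"] == 2:
        raise RuntimeError("simulated crash before the offset is committed")


try:
    run_sink(topic, store, flaky_write)
except RuntimeError:
    pass  # the component restarts

run_sink(topic, store, flaky_write)  # resumes from the committed offset
print(table)  # ['a', 'b', 'b', 'c']
```

No record is missed, but record `b` is written twice because the crash happened after the write and before the commit. That possibility of duplicates is what separates at-least-once from exactly-once delivery.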