
Scaling the DataStax Apache Pulsar™ Connector

Use the Apache Pulsar™ administration tool (pulsar-admin) to increase or decrease the number of workers that run for a given sink by setting its parallelism factor. You can specify the parallelism factor when you create a Pulsar sink, and you can change it afterward. The default parallelism factor is 1.

Configuring parallelism during sink creation

To configure parallelism during sink creation, add the --parallelism flag to the pulsar-admin sinks create command and specify the number of workers.

Example: create a Pulsar sink with a parallelism factor of 3:

bin/pulsar-admin sinks create \
  --name dse-sink-kv \
  --classname com.datastax.oss.sink.pulsar.StringCassandraSinkTask \
  --parallelism 3 \
  --sink-config-file conf/qs.yml \
  --sink-type cassandra-enhanced \
  --tenant public \
  --namespace default \
  --inputs "persistent://public/default/example_topic"
"Created successfully"

The sink will run three parallel Pulsar workers.
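One way to confirm how many workers are actually running is to inspect the output of pulsar-admin sinks status. The sketch below is a hypothetical helper, not part of the connector: it assumes the status output is JSON with a numeric "numRunning" field, which you should verify against your Pulsar version.

```shell
# Hypothetical helper: read sink status JSON on stdin and print the value of
# the first "numRunning" field found. The field name is an assumption about
# the status output; check it against your Pulsar version.
running_instances() {
  sed -n 's/.*"numRunning"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage against a live cluster (requires a running Pulsar instance):
#   bin/pulsar-admin sinks status --tenant public --namespace default \
#     --name dse-sink-kv | running_instances
```

If the printed value is lower than the parallelism factor you requested, some workers may still be starting or may have failed.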

Modifying parallelism after sink creation

To modify the parallelism factor of an existing sink, use the pulsar-admin sinks update command and increase or reduce the factor as required.

Example: reduce the parallelism factor of an existing sink to 1:

bin/pulsar-admin sinks update --name dse-sink-kv --parallelism 1
"Updated successfully"

The sink will terminate all but one of its Pulsar workers.
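Scaling in either direction uses the same update command, so it can be wrapped in a small function. The following is a minimal sketch under stated assumptions: scale_sink and the PULSAR_ADMIN variable are hypothetical names introduced here, and the public tenant and default namespace are taken from the create example above.

```shell
# Path to the admin tool; override PULSAR_ADMIN if it lives elsewhere.
PULSAR_ADMIN="${PULSAR_ADMIN:-bin/pulsar-admin}"

# Hypothetical helper: set the parallelism factor of an existing sink.
# Arguments: <sink-name> <parallelism>
scale_sink() {
  local name="$1" parallelism="$2"
  # Reject empty values, non-digits, and values with a leading zero
  # (including 0 itself) before calling the admin tool.
  case "$parallelism" in
    ''|*[!0-9]*|0*)
      echo "parallelism must be a positive integer" >&2
      return 1
      ;;
  esac
  "$PULSAR_ADMIN" sinks update \
    --tenant public \
    --namespace default \
    --name "$name" \
    --parallelism "$parallelism"
}

# Usage (requires a running Pulsar instance):
#   scale_sink dse-sink-kv 5   # scale up to five workers
#   scale_sink dse-sink-kv 1   # scale back down to one worker
```

Validating the value first avoids sending an obviously invalid parallelism factor to the cluster.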
