DataStax Pulsar Connector

Writing fails because of mutation size

The DataStax Apache Pulsar™ Connector groups the records it writes to the DataStax Enterprise (DSE) database into CQL BATCH commands. Data insertions and deletions in the records are known as mutations. The connector uses single-partition batches, and each batch is submitted as one mutation operation.

For a given batch, if the total size of the mutation exceeds the maximum allowed by DSE (max_mutation_size_in_kb), the batch is rejected and an error message is written to system.log, which is located in /var/log/cassandra by default.

For example:

Mutation of 28087887 bytes is too large for the maximum size of 16777216
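
To see how far a rejected batch exceeds the limit, you can pull the two numbers out of the log line. The following is a minimal sketch, not part of the connector or DSE: the function name parse_mutation_error is hypothetical, and the pattern assumes the message wording shown in the example above, which may differ between DSE versions.

```python
import re

# Matches the message format shown above; this wording is an assumption
# and may vary between DSE versions.
MUTATION_ERROR = re.compile(
    r"Mutation of (\d+) bytes is too large for the maximum size of (\d+)"
)


def parse_mutation_error(line):
    """Return (mutation_bytes, limit_bytes), or None if the line does not match."""
    match = MUTATION_ERROR.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


line = "Mutation of 28087887 bytes is too large for the maximum size of 16777216"
mutation_bytes, limit_bytes = parse_mutation_error(line)

print(f"limit: {limit_bytes // 1024} KB")
print(f"over the limit by: {mutation_bytes - limit_bytes} bytes")
```

In this example the limit of 16777216 bytes corresponds to 16384 KB, the default max_mutation_size_in_kb.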

Remediation

Before making any changes, understand the relationship between the max_mutation_size_in_kb and commitlog_segment_size_in_mb settings. Unless it is set explicitly, max_mutation_size_in_kb is calculated as half of commitlog_segment_size_in_mb, expressed in KB.

Defaults when not set explicitly in cassandra.yaml:

  • commitlog_segment_size_in_mb: 32 MB

  • max_mutation_size_in_kb: 16384 (16 MB)

If you set max_mutation_size_in_kb explicitly, also set commitlog_segment_size_in_mb to:

2 * max_mutation_size_in_kb / 1024
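
The formula above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the helper name is hypothetical, and rounding up to a whole number of megabytes is an assumption made here so that the result is an integer.

```python
import math


def commitlog_segment_size_in_mb(max_mutation_size_in_kb):
    """Apply 2 * max_mutation_size_in_kb / 1024.

    Rounding up to a whole MB is an assumption of this sketch.
    """
    return math.ceil(2 * max_mutation_size_in_kb / 1024)


print(commitlog_segment_size_in_mb(16384))  # 32, matching the defaults
print(commitlog_segment_size_in_mb(32768))  # 64
```

For example, doubling max_mutation_size_in_kb from 16384 to 32768 calls for a commitlog_segment_size_in_mb of 64.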

You can decrease the batch size by lowering the number of records collected in each batch (maxNumberOfRecordsInBatch). The default is 32.
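
A rough way to choose a lower value is to divide the mutation limit by the average size of one record in a failed batch. The sketch below is an estimate only: the function name is hypothetical, it assumes the failed batch held the default 32 records, and it assumes all records are about the same size, which real workloads may not satisfy.

```python
def suggested_max_records(mutation_bytes, limit_bytes, records_in_batch=32):
    """Estimate a maxNumberOfRecordsInBatch value that stays under the limit.

    Assumes records are roughly equal in size (a simplification).
    Never returns less than 1.
    """
    avg_record_bytes = mutation_bytes / records_in_batch
    return max(1, int(limit_bytes // avg_record_bytes))


# Sizes taken from the example error message in this topic.
print(suggested_max_records(28087887, 16777216))  # 19
```

If the estimate comes out as 1, a single record may already be close to or over the limit, and lowering the batch size alone is unlikely to help.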

If you cannot decrease the size of your batches, test whether increasing max_mutation_size_in_kb and commitlog_segment_size_in_mb allows batches to complete successfully without consuming too much RAM on the node that owns the partition. Also increase the DSE database batch warning threshold, batch_size_warn_threshold_in_kb.

Investigate why the mutations are larger than expected. Look for underlying issues with your client application, access patterns, and data model, because increasing the commitlog segment size is a limited fix.

