Server errors

Server Error Origins

Server errors originate at the server and are sent back to the client. Additional information about all of these errors is available in the Apache Cassandra™ native_protocol document.

Authentication errors

Description

The server required authentication, and the authentication attempt failed. The reason for the failure depends on the authenticator in use, and the error message may or may not contain more detail about the failure.

Remediation

Investigate the authentication mechanisms used by Cassandra or DSE and by the application. Double-check the username and password. See Authentication in DataStax drivers and review About DSE Unified Authentication for more information.

Unavailable exceptions

Figure 1. Unavailable exception

Description

This error indicates that the consistency level of the query requires more replicas than are currently available to serve that query. This exception contains 3 parts.

CL: The consistency level of the query that triggered the exception.
Required: An integer representing the number of nodes that must be alive to honor the CL.
Alive: An integer representing the number of replicas that were known to be alive when the request was processed.
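
To make the relationship between Required and Alive concrete, the following minimal sketch computes how many replicas a few common consistency levels need and compares that number with the number of live replicas. It assumes a single-datacenter keyspace, and the function names are illustrative only; they are not part of any driver API.

```python
def required_replicas(consistency_level: str, replication_factor: int) -> int:
    """Replicas that must be alive to honor the CL (single datacenter assumed)."""
    cl = consistency_level.upper()
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    if cl in fixed:
        return fixed[cl]
    if cl == "QUORUM":
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if cl == "ALL":
        return replication_factor
    raise ValueError(f"consistency level not covered by this sketch: {consistency_level}")


def would_be_unavailable(consistency_level: str, replication_factor: int, alive: int) -> bool:
    """True when fewer replicas are alive than the CL requires."""
    return alive < required_replicas(consistency_level, replication_factor)
```

For example, with a replication factor of 3, QUORUM requires 2 replicas, so a query at QUORUM would be rejected as unavailable if only 1 replica is alive.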

Remediation

Ensure that a sufficient number of replicas are available for your consistency level. This error often signals that nodes are down or lacking connectivity to the coordinator. Another possible cause is that the Cassandra or DSE cluster is in the middle of a rolling restart or upgrade. When operators are performing a rolling upgrade or restart, ensure that the previous node is fully up and ready to receive query requests before restarting the next node in the procedure.

Overloaded exceptions

Description

The request cannot be processed because the coordinator node is overloaded by requests.

Remediation

Overloaded exceptions signal that the cluster cannot handle the incoming traffic from clients. This can be triggered by spikes in traffic or by expensive queries exhausting the node’s resources. It typically indicates an under-provisioned cluster.

Write timeouts

Figure 2. Write timeout exceptions

Description

Write timeouts signal that a server-side timeout occurred during the write request. This error contains 4 parts.

CL: The consistency level of the query that triggered the write timeout.
received: The number of nodes that acknowledged the request.
blockfor: An integer that represents the number of replicas required to satisfy the consistency level.
writetype: A string that describes the type of write that timed out. Below are the different types of writes.
- SIMPLE
- BATCH
- BATCH_LOG
- UNLOGGED_BATCH
- COUNTER
- CAS
- VIEW (MV)
- CDC

Remediation

When this happens on a non-idempotent write, such as incrementing a counter, caution must be exercised by the client, as the data may or may not have been written to the table by the node. With an idempotent write, the write can simply be retried. Only batchlog writes are retried by the driver’s default retry policy. A query’s idempotence can be defined in the application. See the individual driver documentation for the API specifics.
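
The retry reasoning above can be sketched as a small application-level helper. This is an illustration under stated assumptions, not the driver's retry policy: the function name is hypothetical, and the `is_idempotent` flag stands in for whatever idempotence marker the application sets on its statements.

```python
def should_retry_write_timeout(write_type: str, is_idempotent: bool) -> bool:
    """Decide whether re-sending a timed-out write is safe (illustrative only)."""
    if write_type == "BATCH_LOG":
        # Per the text above, batchlog writes are the case the default
        # retry policy already retries.
        return True
    if write_type == "COUNTER":
        # A counter increment is not idempotent: retrying may apply it twice.
        return False
    # For every other write type, retry only if the application has
    # declared the statement idempotent.
    return is_idempotent
```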

Depending on the SLAs and application requirements, the default server-side write timeout may not be adequate. This value can be adjusted in the cassandra.yaml configuration file. The location of this file depends on the type of installation:

Package installations: /etc/dse/cassandra/cassandra.yaml
Tarball installations: <installation_location>/resources/cassandra/conf/cassandra.yaml

One common case when write timeouts surface is when batches are large or span multiple partitions. To address this, consider decreasing the batch size and limiting batch writes to a single partition. See this blog post for more details on correctly handling this error.
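
One way to follow this advice is to group rows by partition key and cap the size of each group before building batches. The sketch below shows only that grouping step. The helper name, the `partition_key` callable, and the row shape are assumptions made for illustration; building and executing the actual batch statements is left to the driver in use.

```python
from collections import defaultdict
from typing import Callable, Hashable, Iterable, Iterator, List, TypeVar

Row = TypeVar("Row")


def single_partition_batches(
    rows: Iterable[Row],
    partition_key: Callable[[Row], Hashable],
    max_batch_size: int,
) -> Iterator[List[Row]]:
    """Yield lists of rows that share one partition key, at most max_batch_size each."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    by_partition = defaultdict(list)
    for row in rows:
        by_partition[partition_key(row)].append(row)
    for group in by_partition.values():
        # Split each partition's rows into chunks of bounded size.
        for start in range(0, len(group), max_batch_size):
            yield group[start:start + max_batch_size]
```

Each yielded list can then be written as one batch that touches a single partition.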

Read timeouts

Description

Read timeouts signal that a server-side timeout occurred during the read request. This error contains 4 parts.

CL: The consistency level of the query that triggered the read timeout.
received: The number of nodes that acknowledged the request.
blockfor: An integer that represents the number of replicas required to satisfy the consistency level.
data present: If this value is 0, the replica that was asked for the data did not respond; any other value means that it did. The coordinator asks only a single node for the data and uses a checksum from the other nodes to determine whether the data is consistent.
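
As a reading aid, the fields above can be combined into a rough diagnosis. The following sketch is an assumption-laden illustration: the function name and the returned messages are invented here and do not come from any driver.

```python
def describe_read_timeout(received: int, blockfor: int, data_present: int) -> str:
    """Summarize what the read timeout fields suggest (illustrative only)."""
    if received < blockfor:
        return (f"only {received} of {blockfor} required replicas "
                "responded before the timeout")
    if data_present == 0:
        return ("enough replicas responded, but the replica asked for "
                "the data did not")
    return ("enough replicas responded and the data was present; "
            "the cause lies elsewhere")
```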

Remediation

Read timeouts can occur for a variety of reasons. Possible causes include a query that requests a very large amount of data all at once, or long server-side garbage collection events. Read timeouts typically indicate issues with the data model or query patterns that cause poor performance on the server. To debug, first verify in the server logs that garbage collection times are acceptable, and then examine the data model and access patterns. If no other underlying cause of the timeouts can be diagnosed, the server-side read timeout can be altered in cassandra.yaml.

Read failures

Description

A read failure is a non-timeout exception encountered during a read request. This error contains 5 parts.

CL: The consistency level of the query that triggered the error.
received: The number of nodes that acknowledged the request.
blockfor: An integer that represents the number of replicas required to satisfy the consistency level.
reasonmap: A map of endpoint to failure reason codes. This maps the endpoints of the replica nodes that failed executing the request to the code representing the reason for the failure.
data present: If this value is 0, the replica that was asked for the data did not respond; any other value means that it did. The coordinator asks only a single node for the data and uses a checksum from the other nodes to determine whether the data is consistent.

Remediation

This error is rarely encountered. Investigate the reason map to find the root cause. The most common cause of this type of error is that too many tombstones were read during the request.
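
When investigating a reason map, it can help to count how many replicas reported each failure code. The sketch below assumes the reason map has already been converted to a plain dictionary of endpoint string to integer code. The two code names listed are this sketch's understanding of the native protocol specification and should be verified against the protocol version in use. The endpoints in the usage example are made up.

```python
from collections import Counter
from typing import Dict

# Assumed subset of failure codes; check the native protocol spec for the full list.
FAILURE_CODE_NAMES = {
    0x0000: "UNKNOWN",
    0x0001: "READ_TOO_MANY_TOMBSTONES",
}


def summarize_reason_map(reason_map: Dict[str, int]) -> Dict[str, int]:
    """Count failing replicas per failure reason."""
    counts = Counter()
    for _endpoint, code in reason_map.items():
        # Fall back to the raw code when the name is not in the assumed subset.
        counts[FAILURE_CODE_NAMES.get(code, f"code 0x{code:04x}")] += 1
    return dict(counts)


# Hypothetical reason map with two replicas reporting the same code.
summary = summarize_reason_map({"10.0.0.1": 1, "10.0.0.2": 1, "10.0.0.3": 0})
```

The same summary applies to the reason map carried by write failures.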

Write failures

Description

A write failure is a non-timeout exception encountered during a write request. This error contains 5 parts.

CL: The consistency level of the query that triggered the error.
received: The number of nodes that acknowledged the request.
blockfor: An integer that represents the number of replicas required to satisfy the consistency level.
reasonmap: A map of endpoint to failure reason codes. This maps the endpoints of the replica nodes that failed executing the request to the code representing the reason for the failure.
writeType: A string that describes the type of write that failed. Below are the different types of writes.
- SIMPLE
- BATCH
- BATCH_LOG
- UNLOGGED_BATCH
- COUNTER
- CAS
- VIEW (MV)
- CDC

Remediation

This error is rarely encountered. Examine the reason map to find the root cause. The most common cause of this type of error is that batch sizes are too large.

Function failures

Description

A user-defined function (UDF) failed during execution. The error message contains the following information.

keyspace: The keyspace of the failed function.
function: The name of the failed function.
arg_types: A list of argument types of the failed function.

Remediation

It is likely that something is logically wrong with the user-defined function, such as an infinite loop or syntax error. Scrutinize the UDF definition to find the issue.

Syntax errors

Description

The submitted query contains invalid syntax.

Remediation

Ensure the CQL has correct syntax.

Invalid errors

Description

The submitted query is syntactically correct, but is not a valid query.

Remediation

Ensure the query is valid. Examples of syntactically correct but invalid queries include trying to set the keyspace to a nonexistent keyspace or querying a table that does not exist.

Already exists errors

Description

The query attempted to create a keyspace or table that already exists. This error contains 2 parts.

ks: The keyspace associated with the keyspace or table that already exists.
table: The name of the table that already exists. If no table is involved this is empty.

Remediation

Make sure the keyspace or table does not exist before trying to create it, or use the IF NOT EXISTS CQL syntax.
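
For illustration, the statements below show the IF NOT EXISTS form as CQL strings held in Python. The keyspace name `demo_ks`, the table `users`, its columns, and the replication settings are hypothetical placeholders. Passing the strings to the driver's execute call is not shown.

```python
# Hypothetical keyspace and table; adjust names and replication to your cluster.
CREATE_KEYSPACE = (
    "CREATE KEYSPACE IF NOT EXISTS demo_ks "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}"
)

CREATE_TABLE = (
    "CREATE TABLE IF NOT EXISTS demo_ks.users ("
    "id uuid PRIMARY KEY, "
    "name text)"
)
```

With IF NOT EXISTS, re-running either statement against an existing keyspace or table does not raise an already exists error.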

Unprepared errors

Description

The execution of a prepared statement was attempted when the statement was not prepared in advance.

Remediation

Prepare the statement before executing it.
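
A common application-side pattern is to prepare each distinct query string once and reuse the result. The sketch below is a minimal cache built around an injected `prepare` callable, so it does not depend on a particular driver. The class name is hypothetical, and in a real application the callable would be the driver's own prepare method.

```python
from typing import Any, Callable, Dict


class PreparedStatementCache:
    """Prepare each CQL string at most once and reuse the result (illustrative)."""

    def __init__(self, prepare: Callable[[str], Any]) -> None:
        self._prepare = prepare          # supplied by the application
        self._cache: Dict[str, Any] = {}

    def get(self, cql: str) -> Any:
        # Prepare on first use; later calls return the cached statement.
        if cql not in self._cache:
            self._cache[cql] = self._prepare(cql)
        return self._cache[cql]
```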