Roll back an upgrade

This page describes the procedure for rolling back an in-progress upgrade of Apache Cassandra.

If you encounter problems by the time you reach Phase 5: Decide to continue or abandon the upgrade, such as application errors, nodes failing to start, or unexplained behavior that prevents you from completing the upgrade, you can roll back the cluster to the previous version of Cassandra.

Step 1: Shut down Cassandra

Stop the Cassandra service on the upgraded node.

  1. Drain the node.

    Command:

    nodetool drain

    Result:

    The nodetool drain command doesn’t return any output.

    You can monitor drain progress by checking the Cassandra system.log file for messages similar to the following:

    INFO  [RMI TCP Connection(4)-127.0.0.1] 2023-06-01 03:59:37,442 StorageService.java:1660 - DRAINING: starting drain process
    INFO  [RMI TCP Connection(4)-127.0.0.1] 2023-06-01 03:59:37,443 HintsService.java:210 - Paused hints dispatch
    INFO  [RMI TCP Connection(4)-127.0.0.1] 2023-06-01 03:59:37,449 Server.java:179 - Stop listening for CQL clients
    INFO  [RMI TCP Connection(4)-127.0.0.1] 2023-06-01 03:59:37,449 Gossiper.java:1720 - Announcing shutdown
    INFO  [RMI TCP Connection(4)-127.0.0.1] 2023-06-01 03:59:37,465 StorageService.java:2585 - Node /10.166.73.33 state jump to shutdown
    INFO  [RMI TCP Connection(4)-127.0.0.1] 2023-06-01 03:59:39,469 MessagingService.java:985 - Waiting for messaging service to quiesce
    INFO  [ACCEPT-/10.166.72.33] 2023-06-01 03:59:39,470 MessagingService.java:1346 - MessagingService has terminated the accept() thread
    INFO  [RMI TCP Connection(4)-127.0.0.1] 2023-06-01 03:59:39,794 HintsService.java:210 - Paused hints dispatch
    INFO  [RMI TCP Connection(4)-127.0.0.1] 2023-06-01 03:59:39,806 StorageService.java:1660 - DRAINED

    You can also confirm the status of the drain by running the nodetool netstats command and checking for Mode: DRAINED in the output.

    Command:

    nodetool netstats

    Result:

    The drain was successful if you see Mode: DRAINED in the output.

    Mode: DRAINED
    Not sending any streams.
    Read Repair Statistics:
    Attempted: 0
    Mismatch (Blocking): 0
    Mismatch (Background): 0
    Pool Name                    Active   Pending      Completed   Dropped
    Large messages                  n/a         2              0         0
    Small messages                  n/a         2              5         0
    Gossip messages                 n/a         2            122         0

    When you run the nodetool drain command, Cassandra stops listening for connections from clients and other nodes (no more data is written to the node) and all memtables are flushed to SSTables on disk. This ensures that all the data on the node is safely stored on disk before you stop Cassandra and begin the rollback.

    After running nodetool drain, the node will not be able to service client reads or writes until Cassandra is restarted.

  2. Stop Cassandra.

    To stop the Cassandra service for package installations:

    sudo service cassandra stop

    To stop the Cassandra process for tarball installations (matching on CassandraDaemon avoids killing unrelated processes that merely mention cassandra, such as a tail of the log file):

    sudo kill $(ps auwx | grep CassandraDaemon | grep -v "grep" | tr -s ' ' | cut -d' ' -f2)

  3. Ensure that no Cassandra process is running on the server.

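    Command:

    ps auwx | grep CassandraDaemon

    Result:

    root       27921  0.0  0.0   3304   656 pts/0    S+   07:38   0:00 grep --color=auto CassandraDaemon

    If the only line returned is the grep command itself, as in this example, then no Cassandra process is running.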

    Note that Cassandra might take some time to shut down, especially if it’s currently handling requests. If the service continues to run, then kill the process using the following command:

    sudo kill -9 $(ps auwx | grep CassandraDaemon | grep -v "grep" | tr -s ' ' | cut -d' ' -f2)
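
The shutdown sequence in this step could be scripted roughly as follows. This is a minimal sketch, not an official tool: the helper names is_drained and wait_for_exit and the 60-second timeout are assumptions chosen for illustration, and the example flow assumes a package installation managed with the service command.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the timeout are illustrative assumptions.

# Succeeds if the `nodetool netstats` output on stdin reports Mode: DRAINED.
is_drained() {
    grep -q '^Mode: DRAINED'
}

# Polls once per second until no process matching the pattern is left,
# or until the timeout (in seconds) expires.
# Returns 0 if the process exited, 1 if it is still running.
wait_for_exit() {
    local pattern="$1" timeout="${2:-60}" waited=0
    while pgrep -f "$pattern" > /dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example flow (commented out so the helpers can be sourced safely):
# nodetool drain
# nodetool netstats | is_drained || { echo "node is not drained" >&2; exit 1; }
# sudo service cassandra stop
# wait_for_exit CassandraDaemon 60 || echo "Cassandra is still running" >&2
```

If wait_for_exit reports that the process is still running, fall back to the kill -9 command shown above.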

Step 2: Downgrade to the previously installed version of Cassandra

Downgrade the Cassandra version using the same methodology and tooling you used for the upgrade.

If you’re using a configuration management system, you should now "converge" the node with the old version of the manifest, cookbook, etc.

If no configuration management is used, replace the binary or install the old Cassandra version. Commands for different Linux distributions are described below.

Ensure that the packaging system and configurations have been rolled back to the previous version of Cassandra. The procedure varies across Linux distributions and containerization platforms.

  • Debian/Ubuntu (APT)

  • CentOS/RHEL (YUM)

  • Tarball

  • Docker

Debian/Ubuntu (APT):

  1. Ensure that the previous version of Cassandra is the version used in the cassandra.sources.list file.

    For example, if the previous version is 3.11.15, then the corresponding distribution name is 311x (with an "x" as the suffix). To update the repository for version 3.11.15 (311x):

    Command:

    echo "deb https://debian.cassandra.apache.org 311x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list

    Result:

    deb https://debian.cassandra.apache.org 311x main

  2. Add the Apache Cassandra repository keys to the list of trusted keys on the server:

    cURL:

    curl https://downloads.apache.org/cassandra/KEYS | sudo apt-key add -

    Wget (the -qO - options write the downloaded keys to standard output so that they can be piped to apt-key):

    wget -qO - https://downloads.apache.org/cassandra/KEYS | sudo apt-key add -

    Result (cURL):

      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100  278k  100  278k    0     0   168k      0  0:00:01  0:00:01 --:--:--  168k
    OK

  3. Update the package index from sources:

    sudo apt-get update
  4. Remove the upgraded version of Cassandra:

    sudo apt remove cassandra
  5. Install the previous version of Cassandra:

    sudo apt-get install cassandra=3.11.15
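
The repository series name used in step 1 can be derived from the version number. The following is a minimal sketch: series_for_version is a hypothetical helper, not part of Cassandra, and it assumes that other release series follow the same naming pattern as the 3.11.15 to 311x example.

```shell
#!/usr/bin/env bash
# Sketch only: series_for_version is a hypothetical helper.

# Prints the repository series name for a Cassandra version by joining
# the major and minor numbers and appending "x" (3.11.15 -> 311x).
series_for_version() {
    local major rest minor
    major="${1%%.*}"      # text before the first dot
    rest="${1#*.}"        # text after the first dot
    minor="${rest%%.*}"   # text between the first and second dots
    echo "${major}${minor}x"
}

previous_version="3.11.15"   # the version you are rolling back to
series=$(series_for_version "$previous_version")

# Print the repository line for review before adding it to
# /etc/apt/sources.list.d/cassandra.sources.list.
echo "deb https://debian.cassandra.apache.org ${series} main"
```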

CentOS/RHEL (YUM):

  1. Update the Apache Cassandra repository information in the /etc/yum.repos.d/cassandra.repo file (as the root user) to ensure that the previous version of Cassandra is the version used by the packaging system.

    For example, if the previous version is 3.11.15, then the corresponding distribution name is 311x (with an "x" as the suffix). To update the repository for version 3.11.15 (311x), make sure the content of cassandra.repo matches the following:

    [cassandra]
    name=Apache Cassandra
    baseurl=https://redhat.cassandra.apache.org/311x/
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://downloads.apache.org/cassandra/KEYS
  2. Update the package index.

    Command:

    sudo yum update

    Result:

    Apache Cassandra                                817  B/s | 833  B     00:01
    Apache Cassandra                                199 kB/s | 275 kB     00:01
    Importing GPG key 0xF2833C93:
     Userid     : "Eric Evans <eevans@sym-link.com>"
     Fingerprint: CEC8 6BB4 A0BA 9D0F 9039 7CAE F835 8FA2 F283 3C93
     From       : https://downloads.apache.org/cassandra/KEYS
    Is this ok [y/N]:

    Type Y and press Return to import each of the GPG keys.

  3. Remove the upgraded version of Cassandra:

    sudo yum remove cassandra
  4. Install the previous version of Cassandra:

    sudo yum install cassandra-3.11.15-1

    Type Y and press Return to begin the installation.

    After the packages have downloaded, you’ll be asked to import GPG keys. Type Y and press Return to import each of the GPG keys, after which the installation will continue until completion.

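Tarball:

  1. In the directory where Cassandra is installed, run the following commands to delete the contents of the directory except for the data directory. The !(data) pattern requires the Bash extglob shell option, which the first command enables.

    shopt -s extglob
    rm -vfr !(data)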
  2. Download the binary tarball for the previous version of Cassandra. For example, to download Cassandra 3.11.15:

    cURL:

    curl -OL https://archive.apache.org/dist/cassandra/3.11.15/apache-cassandra-3.11.15-bin.tar.gz

    Wget:

    wget https://archive.apache.org/dist/cassandra/3.11.15/apache-cassandra-3.11.15-bin.tar.gz

    To download a different version of Cassandra, visit the Apache Archives.

  3. Unpack the tarball:

    tar xzf apache-cassandra-3.11.15-bin.tar.gz

    The files will be extracted to the apache-cassandra-3.11.15 directory. This is the tarball installation location.

  4. Move the apache-cassandra-3.11.15 directory to the same location as your current installation of Cassandra. For example:

    mv apache-cassandra-3.11.15 /usr/local/cassandra-3
  5. Update your PATH and environment variables to point to the new installation. For example:

    export PATH="/usr/bin:/usr/local/cassandra-3/bin:/usr/local/cassandra-3/tools/bin:$PATH"
  6. Delete the tarball.

    rm apache-cassandra-3.11.15-bin.tar.gz

Docker:

On a Docker cluster, the rollback procedure differs. Starting the previous containers is the simplest approach. Starting new containers from the previous Docker image is also possible, but you must address the concerns and challenges listed in Docker considerations above.

Step 3: Delete Cassandra operational data

Delete the operational files that Cassandra created while it was running, specifically the contents of the hints, commitlog, and saved_caches directories. The deletion commands in this step assume that all three directories are located in the default path used by the package installation: /var/lib/cassandra.

The paths for the hints, commitlog, and saved_caches directories can each be set in the cassandra.yaml file.

Setting in cassandra.yaml    Corresponding operational directory
hints_directory              hints
commitlog_directory          commitlog
saved_caches_directory       saved_caches

Run the following commands to delete the operational files created by Cassandra. If any of the settings in the above table are defined in the cassandra.yaml file, then use that path value for the corresponding operational directory.

sudo rm /var/lib/cassandra/hints/*
sudo rm /var/lib/cassandra/commitlog/*
sudo rm /var/lib/cassandra/saved_caches/*
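
To check which paths apply on a node before deleting anything, the settings can be read from cassandra.yaml with a fallback to the defaults. This is a minimal sketch: yaml_dir is a hypothetical helper that only handles simple "key: value" lines (no quoted values or trailing comments), and the /etc/cassandra/cassandra.yaml location in the usage example is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: yaml_dir is a hypothetical helper, not a YAML parser.

# Prints the value of a top-level "key: value" setting from the given
# file, or the supplied default if the setting is absent or commented out.
yaml_dir() {
    local key="$1" file="$2" default="$3" value
    value=$(sed -n "s/^${key}:[[:space:]]*//p" "$file" | head -n 1)
    echo "${value:-$default}"
}

# Example usage (the cassandra.yaml location is an assumption):
# conf=/etc/cassandra/cassandra.yaml
# yaml_dir hints_directory        "$conf" /var/lib/cassandra/hints
# yaml_dir commitlog_directory    "$conf" /var/lib/cassandra/commitlog
# yaml_dir saved_caches_directory "$conf" /var/lib/cassandra/saved_caches
```

Review the printed paths, then substitute them into the rm commands above.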

Step 4: Restore the snapshot files

Find and delete any new SSTables created by the node. If upgrading to Cassandra 4.x, these SSTables will be in the Cassandra 4.0+ format.

CASSANDRA_DATA=/full/path/to/cassandra/data/
sudo find "${CASSANDRA_DATA}" \
    -type f \
    -iname "nb-*" \
    -exec rm -f {} +

Restore the snapshot files for each table stored on the node.

CASSANDRA_DATA=/full/path/to/cassandra/data/
sudo find "${CASSANDRA_DATA}" \
    -type d \
    -iname "pre-40-upgrade*" \
    -exec bash -c 'cp -p "$1"/* "$1"/../../' _ {} \;
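
Because both commands in this step change data files, it can help to preview what they will touch first. The following dry-run sketch only prints paths; the helper names list_new_sstables and list_restore_targets are hypothetical. It also shows why the restore copies to ../../: each snapshot directory sits two levels below its table directory.

```shell
#!/usr/bin/env bash
# Sketch only: both helpers are hypothetical and only print paths.

# Lists SSTable files in the Cassandra 4.0+ format ("nb-" prefix).
list_new_sstables() {
    find "$1" -type f -iname 'nb-*' -print
}

# For each pre-upgrade snapshot directory, prints the snapshot path and
# the table directory two levels up, where its files would be copied.
list_restore_targets() {
    find "$1" -type d -iname 'pre-40-upgrade*' -print | while IFS= read -r snap; do
        echo "$snap -> $(dirname "$(dirname "$snap")")"
    done
}

# Example usage:
# CASSANDRA_DATA=/full/path/to/cassandra/data/
# list_new_sstables "$CASSANDRA_DATA"
# list_restore_targets "$CASSANDRA_DATA"
```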

Step 5: Restore the configuration files

Decompress the configuration file backups you made previously and restore the configurations for the previous version of Cassandra.

cd /etc/cassandra
sudo tar xzf ~/cassandra-config-backup.tgz

Step 6: Start Cassandra

Start Cassandra on the downgraded node using the command for your installation type (or an equivalent on your system).

To start the Cassandra service for packaged installations:

sudo service cassandra start

To start the Cassandra process for tarball installations:

<install-location>/bin/cassandra

After a few seconds, the internal Cassandra processes will come online.

Step 7: Confirm Cassandra status

After starting the downgraded Cassandra service, check the status of the node while also continually monitoring your applications for signs of degraded performance and functionality.

  1. Monitor /var/log/cassandra/system.log to confirm there are no ERROR or WARN statements.

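    Command:

    sudo tail -n 50 -f /var/log/cassandra/system.log

    Result:

    Messages similar to the following appear in system.log when the Cassandra process starts. This sample shows a node starting on version 4.1.2; after a rollback, the reported Cassandra version should be the previous version (for example, 3.11.15).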

    INFO  [main] 2023-06-01 04:44:45,897 SystemKeyspace.java:1729 - Detected version upgrade from 3.11.15 to 4.1.2, snapshotting system keyspaces
    ...
    INFO  [main] 2023-06-01 05:06:00,630 StorageService.java:864 - Cassandra version: 4.1.2
  2. Confirm that the node is reporting status UN (Up and Normal).

    Command:

    nodetool status

    Result:

    Datacenter: datacenter1
    =======================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
    UN  10.166.73.33   360.5 KiB   256     28.2%             7370b2ef-c9c3-4c83-bb44-0879cef0f6c8  rack1
    UN  10.166.76.162  301.28 KiB  256     34.3%             e8f291db-b3ec-47fa-8bee-27e033c9655f  rack1
    UN  10.166.77.78   132 KiB     256     37.4%             baae5ba3-a842-4b23-a70b-069364dac689  rack1

    It’s important to check that the other nodes see the node as UP, and also that the node sees all other nodes as UP. Therefore, you should run nodetool status both on the node and on at least one other node. The output should be the same on both nodes.

  3. Confirm that the node is using the intended version of Cassandra.

    Command:

    nodetool version

    Result (after a rollback, this should be the previous version, for example 3.11.15):

    ReleaseVersion: 3.11.15

  4. Confirm that the node is processing read and write traffic as well as requests from clients. You can confirm this by watching for Completed tasks in the thread pool stats:

    Command:

    watch -d nodetool tpstats

    Result:

    Every 2.0s: nodetool tpstats
    
    Pool Name                    Active Pending Completed Blocked All time blocked
    RequestResponseStage         0      0       3         0       0
    ReadStage                    0      0       3         0       0
    CompactionExecutor           0      0       6422      0       0
    MemtableReclaimMemory        0      0       44        0       0
    PendingRangeCalculator       0      0       8         0       0
    GossipStage                  0      0       36429     0       0
    SecondaryIndexManagement     0      0       1         0       0
    HintsDispatcher              0      0       0         0       0
    MigrationStage               0      0       34        0       0
    MemtablePostFlush            0      0       60        0       0
    PerDiskMemtableFlushWriter_0 0      0       33        0       0
    ValidationExecutor           0      0       0         0       0
    Sampler                      0      0       0         0       0
    ViewBuildExecutor            0      0       0         0       0
    MemtableFlushWriter          0      0       44        0       0
    CacheCleanupExecutor         0      0       0         0       0
    Native-Transport-Requests    0      0       0         0       0
    
    Latencies waiting in queue (micros) per dropped message types
    Message type                      Dropped     50%      95%      99%      Max
    READ_RSP                          0           0.0      0.0      0.0      0.0
    RANGE_REQ                         0           0.0      0.0      0.0      0.0
    PING_REQ                          0           0.0      0.0      0.0      0.0
    PAXOS2_COMMIT_REMOTE_RSP          0           0.0      0.0      0.0      0.0
    PAXOS2_COMMIT_AND_PREPARE_RSP     0           0.0      0.0      0.0      0.0

    Specifically, you should confirm that counters for the following tasks are increasing:

    • ReadStage: local read tasks

    • MutationStage: local write tasks

    • RequestResponseStage: tasks that process the responses from replicas when acting as a coordinator

    If all of these counters are increasing, it indicates that the node is successfully processing traffic and communicating properly with the rest of the cluster. There should be no pending, blocked, or dropped messages.
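
The status and version checks in this step could be automated along these lines. This is a minimal sketch: the helper names non_un_nodes and version_matches are hypothetical, and the version 3.11.15 in the usage example stands in for whatever version you rolled back to.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.

# Reads `nodetool status` output on stdin and prints the address of
# every node whose status/state code is anything other than UN.
non_un_nodes() {
    awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { print $2 }'
}

# Reads `nodetool version` output on stdin and succeeds only if it
# contains a line that exactly reports the expected release version.
version_matches() {
    grep -qxF "ReleaseVersion: $1"
}

# Example usage:
# nodetool status | non_un_nodes        # no output means every node is UN
# nodetool version | version_matches 3.11.15 || echo "unexpected version" >&2
```

Run the status check on the rolled-back node and on at least one other node, as described in item 2.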

Step 8: Run a repair

This node will experience data loss. Any data written to it while the new version of Cassandra was running is lost, and that data is also absent from the pre-upgrade snapshot because the snapshot was taken before the new version was installed. As a result, data written to the cluster during this time will likely be in an inconsistent state.

To resolve this issue, run a repair on the node using Reaper. A new repair will need to be configured and launched for each non-system keyspace on the node.
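
To see which keyspaces need a repair configured, you could filter the keyspace list for non-system names. This is a minimal sketch: non_system_keyspaces is a hypothetical helper, it assumes that every system keyspace name starts with "system" (so a user keyspace with that prefix would also be filtered out), and the usage example assumes that cqlsh can connect to the node.

```shell
#!/usr/bin/env bash
# Sketch only: non_system_keyspaces is a hypothetical helper.

# Reads whitespace-separated keyspace names on stdin and prints, one per
# line, those whose names do not start with "system".
non_system_keyspaces() {
    tr -s '[:space:]' '\n' | grep -v -e '^system' -e '^$'
}

# Example usage (assumes cqlsh can connect to the node):
# cqlsh -e 'DESCRIBE KEYSPACES' | non_system_keyspaces
```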

Step 9: Roll back remaining upgraded nodes

Repeat the previous steps on each node that had been upgraded to the new version of Cassandra until all nodes in the cluster have been downgraded to the previously installed version of Cassandra.

© 2024 DataStax | Privacy policy | Terms of use