Purging gossip state on a node

Correcting a problem in the gossip state.

Each node persists gossip information locally so that it can use that information immediately on restart, without waiting for gossip communication with other nodes.

Procedure

In the unlikely event you need to correct a problem in the gossip state:

  1. Using MX4J or JConsole, connect to the node's JMX port and invoke the unsafeAssassinateEndpoint(ip_address) operation on the Gossiper MBean to assassinate the problem node.

    This operation takes a few seconds to complete, so wait for confirmation that the node has been removed.

  2. If the JMX method above doesn't solve the problem, stop your client application from sending writes to the cluster.
  3. Take the entire cluster offline:
    1. Drain each node:
      C:\> %CASSANDRA_HOME%\nodetool options drain
    2. Stop each node:
      C:\> net stop DataStax_Cassandra_Community_server
  4. Clear the data from the peers directory:
    C:\> rmdir "Program Files\DataStax Community\data\data\system\peers\*" /s
    CAUTION:
    This step clears internal system data from Cassandra and can cause an application outage if it is not executed carefully and its results are not validated. To validate the results once the cluster is back online (step 6), run the following query on each node individually and confirm that every node can see all of the other nodes:
    select * from system.peers;
  5. Add the following line to the cassandra-env.ps1 file to clear the gossip state when the node starts:
    $env:JVM_OPTS="$env:JVM_OPTS -Dcassandra.load_ring_state=false"
    On Windows installations, the cassandra-env.ps1 file is located at:
    C:\Program Files\DataStax Community\apache-cassandra\conf\cassandra-env.ps1
  6. Bring the cluster online one node at a time, starting with the seed nodes:
    C:\> net start DataStax_Cassandra_Community_server
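
Step 1 can also be scripted instead of being performed by hand in JConsole. The following is a minimal Java sketch under stated assumptions, not a supported tool. It assumes that the Gossiper MBean is registered as org.apache.cassandra.net:type=Gossiper, that the operation is exposed under the name unsafeAssassinateEndpoint and takes the address as a single string, and that JMX is reachable without authentication (7199 is the default Cassandra JMX port). The class name AssassinateEndpoint and the helper jmxUrl are hypothetical names chosen for illustration. Verify the operation name in JConsole for your Cassandra version before relying on it.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class AssassinateEndpoint {
    // Assumed MBean and operation names; check them in JConsole for your version.
    static final String GOSSIPER_MBEAN = "org.apache.cassandra.net:type=Gossiper";
    static final String OPERATION = "unsafeAssassinateEndpoint";

    // Builds the conventional RMI-based JMX service URL for a host and port.
    public static String jmxUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    // Connects to a live node's JMX port and asks it to assassinate the problem node.
    public static void assassinate(String jmxHost, int jmxPort, String problemNodeIp)
            throws Exception {
        JMXServiceURL url = new JMXServiceURL(jmxUrl(jmxHost, jmxPort));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbeans = connector.getMBeanServerConnection();
            mbeans.invoke(
                    new ObjectName(GOSSIPER_MBEAN),
                    OPERATION,
                    new Object[] { problemNodeIp },           // operation argument
                    new String[] { String.class.getName() }); // argument type signature
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: AssassinateEndpoint <jmx_host> <jmx_port> <problem_node_ip>");
            return;
        }
        assassinate(args[0], Integer.parseInt(args[1]), args[2]);
        System.out.println("Assassinate request sent for " + args[2]);
    }
}
```

The sketch connects to a healthy node, not to the problem node itself, and passes the problem node's IP address as the argument. The invoke call returns only after the operation finishes, which matches the few seconds of waiting described in step 1.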

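The validation described in the caution for step 4 (every node sees every other node) can be checked mechanically once the cluster is back online. The following is a minimal Java sketch that assumes you have already collected, by whatever means you prefer, the values of the peer column returned by the system.peers query on each node. The class name PeersCheck, the method missingPeers, and the sample addresses are hypothetical. The sketch relies on the fact that a node does not list itself in system.peers.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class PeersCheck {
    /**
     * Takes a map from each node's address to the peer addresses that node
     * reported, and returns, for every node with an incomplete view, the
     * cluster members absent from its system.peers rows.
     */
    public static Map<String, Set<String>> missingPeers(
            Map<String, ? extends Collection<String>> peersByNode) {
        Map<String, Set<String>> missing = new LinkedHashMap<>();
        for (String node : peersByNode.keySet()) {
            // Expected peers: every known node except the node itself.
            Set<String> expected = new TreeSet<>(peersByNode.keySet());
            expected.remove(node);
            // Whatever remains after removing the reported peers is missing.
            expected.removeAll(peersByNode.get(node));
            if (!expected.isEmpty()) {
                missing.put(node, expected);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Sample data only: the third node has not yet learned about 10.0.0.2.
        Map<String, List<String>> observed = new LinkedHashMap<>();
        observed.put("10.0.0.1", Arrays.asList("10.0.0.2", "10.0.0.3"));
        observed.put("10.0.0.2", Arrays.asList("10.0.0.1", "10.0.0.3"));
        observed.put("10.0.0.3", Arrays.asList("10.0.0.1"));
        System.out.println(missingPeers(observed)); // prints {10.0.0.3=[10.0.0.2]}
    }
}
```

An empty result means that every node reported all of the others. Any entry in the result identifies a node whose peers table is still incomplete.
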
What's next

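After the cluster is back online, remove the line you added to the cassandra-env.ps1 file so that each node loads its saved ring state again on later restarts.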