Skip to content

Conversation

@cjen1-msft
Copy link
Contributor

Closes #7618

The hypothesis is that a backup had previously acked a regular entry, but was missing any subsequent signature, was partitioned, and became a pre-vote candidate.
When the partition was removed it received an append-entries message and became a follower again, in doing so truncating the previously acked regular entry.

From this point onwards, the backup would keep nacking the append-entries, but the leader who's match_idx for that replica is now higher than the last index in the ledger would never reduce the sent_idx past the match_idx, and so keep sending messages which could not be ack'd by the backup.

The solution is to remove the rollback of uncommittable entries during become_follower.
This leaves the recv_append_entries function fixup path as the only place backups truncate entries, and only until a non-fixup append-entries is received.

Additionally I've moved the last_ack_timeout to after the checks for a valid append_entries_response, as if a leader is unhealthy it should step-down via check-quorum, rather than keeping on being a faulty leader.

Finally I cleaned up a line number in the documentation, and made the invalid view check more explicit.

@cjen1-msft cjen1-msft requested a review from a team as a code owner January 26, 2026 15:38
Copilot AI review requested due to automatic review settings January 26, 2026 15:38
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes issue #7618 where a backup node could get into a perpetual disagreement with the leader after a network partition. The issue occurred when a backup node acked an entry, became a pre-vote candidate during a partition, and then truncated its log when becoming a follower again. This left the leader's match_idx higher than the backup's actual ledger, causing an infinite nack loop.

Changes:

  • Removed rollback of uncommittable entries from become_follower() to prevent premature log truncation
  • Moved last_ack_timeout reset to occur only after successful append-entries validation
  • Improved code clarity by using ccf::VIEW_UNKNOWN constant instead of literal 0

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
tests/raft_scenarios/follower_rollback_match_index New test scenario that reproduces the partition scenario from issue #7618
src/consensus/aft/raft.h Core fix: removed rollback from become_follower, moved last_ack_timeout reset, and improved code clarity with VIEW_UNKNOWN constant

/\ Len(log[logline.msg.state.node_id]) = logline.msg.state.last_idx
\* The log is truncated during BecomeLeader to the last committable index, and so membership state may have also changed
/\ membershipState'[logline.msg.state.node_id] \in ToMembershipState[logline.msg.state.membership_state]
/\ Len(log'[logline.msg.state.node_id]) = logline.msg.state.last_idx
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous behaviour was that when becoming candidate we truncated the ledger to last committable index, and then the truncation during BecomeLeader was a no-op.
Now we only truncate during BecomeLeader, so we need to talk about the subsequent state.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated no new comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Inconsistent "Disagreement about primaries" error in partitions_test

3 participants