storage: Possible race between RequestLease and ChangeReplicas #15385
Open
@bdarnell

Description

As discussed in #15355, TransferLease is now sequenced with respect to concurrent replica changes by the command queue. RequestLease has no such protection because it is evaluated on followers (unlike all other commands). It is instead guarded by the RaftCommand.ProposerLease field, which ensures that the replica requesting the lease has an up-to-date view of the current lease. This allows for a race in which a replica is removed from the range immediately after it has attempted to take the lease. (This race is difficult to hit in practice because the range must be healthy to execute the ChangeReplicas transaction, while the current lease holder must be, or appear to be, unhealthy for a follower to attempt to grab the lease.) When this occurs, we will hit the log.Fatal that prevents ranges from getting stuck with a lease on a non-member store.
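For illustration, here is a minimal Go sketch of the gap described above. The types and field names are simplified stand-ins, not the actual roachpb/storage definitions: the point is that a check comparing only the proposer's lease to the current lease is satisfied even when a ChangeReplicas commits in between, so the problem is only caught by the log.Fatal.

```go
// Minimal sketch (not the actual CockroachDB code) of why a ProposerLease
// equality check alone cannot catch a concurrent replica change.
package main

import "fmt"

// Lease is a simplified stand-in for roachpb.Lease.
type Lease struct {
	HolderStoreID int
	// Note: nothing here changes when the range's membership changes,
	// which is the gap described above.
}

// RaftCommand carries the proposer's view of the lease, in the spirit of
// RaftCommand.ProposerLease.
type RaftCommand struct {
	ProposerLease Lease
}

// applyRequestLease models the below-Raft guard: the command is rejected
// only if the lease changed between proposal and application. A replica
// change in that window leaves the lease untouched, so the command is
// accepted even though the proposer may no longer be a member.
func applyRequestLease(cmd RaftCommand, currentLease Lease, isMember bool) error {
	if cmd.ProposerLease != currentLease {
		return fmt.Errorf("proposed under outdated lease, rejecting")
	}
	if !isMember {
		// This is the point where the log.Fatal mentioned above fires today.
		return fmt.Errorf("lease acquired by non-member store")
	}
	return nil
}

func main() {
	lease := Lease{HolderStoreID: 1}
	cmd := RaftCommand{ProposerLease: lease}
	// The proposing replica is removed by ChangeReplicas before the
	// RequestLease applies; the lease itself is unchanged, so the guard
	// does not notice.
	fmt.Println(applyRequestLease(cmd, lease, false /* isMember */))
}
```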

The simplest fix I see is to add a counter to roachpb.Lease that is incremented on every replica change, so that the ProposerLease will be seen as outdated when a RequestLease crosses over a rebalance.
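A minimal sketch of that idea, again with simplified stand-in types and a hypothetical field name (ReplicaChangeCount): bumping the counter in ChangeReplicas makes the stale ProposerLease comparison fail, closing the race without reaching the log.Fatal.

```go
// Sketch of the proposed fix: a counter on the lease that every
// ChangeReplicas bumps, so a RequestLease proposed before the membership
// change carries a stale counter and is rejected.
package main

import "fmt"

type Lease struct {
	HolderStoreID int
	// ReplicaChangeCount is the hypothetical counter suggested above;
	// the real field name, if implemented, may differ.
	ReplicaChangeCount int
}

type RaftCommand struct {
	ProposerLease Lease
}

// applyChangeReplicas bumps the counter as part of the membership change.
func applyChangeReplicas(currentLease *Lease) {
	currentLease.ReplicaChangeCount++
}

// applyRequestLease rejects commands whose proposer-side lease predates the
// most recent replica change, because the counters no longer match.
func applyRequestLease(cmd RaftCommand, currentLease Lease) error {
	if cmd.ProposerLease != currentLease {
		return fmt.Errorf("proposed under outdated lease, rejecting")
	}
	return nil
}

func main() {
	lease := Lease{HolderStoreID: 1, ReplicaChangeCount: 7}
	cmd := RaftCommand{ProposerLease: lease}

	// ChangeReplicas commits between proposal and application of the
	// RequestLease, bumping the counter.
	applyChangeReplicas(&lease)

	// The stale counter makes the ProposerLease comparison fail.
	fmt.Println(applyRequestLease(cmd, lease))
}
```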

Jira issue: CRDB-44008

Metadata

Assignees

No one assigned

    Labels

    A-kv-replication: Relating to Raft, consensus, and coordination.
    C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
    branch-release-20.1: Used to mark GA and release blockers, technical advisories, and bugs for 20.1
