From 1471e241a33abc1f0774ce7ec418a134b655f9ac Mon Sep 17 00:00:00 2001
From: David Teigland
Date: Oct 11 2018 18:05:13 +0000
Subject: sanlock: fix missing dblock return from paxos_lease_acquire


In commit 933a474339ded8d025c99356f583d228b6a2b9ba
  sanlock: fix release interference with paxos

The function write_mblock_shared_dblock_release() was added to
preserve the current dblock values when setting the mode block
in the same sector. The latest copy of the dblock that was
returned from paxos_lease_acquire was saved and copied into
the buffer used to write the mode block.

The problem is that the latest copy of the dblock was not
always being returned from paxos_lease_acquire [1].

The result is that a host was sometimes copying garbage
dblock values back into its dblock. If other hosts then
tried to acquire this same lock, they could get confused by
the invalid dblock values.

The specific problem that was seen is where the bad dblock
lver value was larger than the lver value in the leader record.
This would cause paxos_lease_acquire on the other host to
abort repeatedly because of the "larger lver in bk"
(believing another host was competing for a newer version
of the lease.)

[1] The failure to return the latest dblock copy was introduced
in commit 6501351b742:
  sanlock: preserve dblock values when setting shared flag

The specific problem in that commit is when another host
commits us as the lease owner, in which case paxos_lease_acquire
missed returning its latest dblock.
---

diff --git a/src/paxos_lease.c b/src/paxos_lease.c
index ed16f41..3acb829 100644
--- a/src/paxos_lease.c
+++ b/src/paxos_lease.c
@@ -1986,6 +1986,7 @@ int paxos_lease_acquire(struct task *task,
 				  (unsigned long long)tmp_leader.write_id);
 
 			memcpy(leader_ret, &tmp_leader, sizeof(struct leader_record));
+			memcpy(dblock_ret, &dblock, sizeof(struct paxos_dblock));
 			error = SANLK_OK;
 		} else {
 			/* not a problem, but interesting to see */
diff --git a/src/resource.c b/src/resource.c
index d0da60b..07bcfb6 100644
--- a/src/resource.c
+++ b/src/resource.c
@@ -1594,6 +1594,8 @@ int acquire_token(struct task *task, struct token *token, uint32_t cmd_flags,
 	int new_id = 0;
 	int rv;
 
+	memset(&dblock, 0, sizeof(dblock));
+
 	if (token->acquire_flags & SANLK_RES_LVER)
 		acquire_lver = token->acquire_lver;
 	if (token->acquire_flags & SANLK_RES_NUM_HOSTS)
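
Illustration only, not part of the patch. The failure mode described in
the commit message can be sketched in isolation as below. This is a
minimal sketch, not sanlock code: struct demo_dblock, demo_acquire() and
demo_caller_lver() are hypothetical names, the values 7 and 3 are
arbitrary, and the 0xAA fill is an assumed stand-in for whatever happened
to be on the caller's stack (reading a truly uninitialized struct would
be undefined behavior).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for sanlock's struct paxos_dblock. */
struct demo_dblock {
	uint64_t mbal;
	uint64_t lver;
};

/*
 * Hypothetical acquire routine that always succeeds via the path where
 * another host committed us as the lease owner. return_dblock selects
 * whether that path copies the latest dblock to the caller
 * (1 = behavior with the fix, 0 = behavior before it).
 */
static int demo_acquire(int return_dblock, struct demo_dblock *dblock_ret)
{
	/* Latest dblock as read from disk (arbitrary example values). */
	struct demo_dblock dblock = { .mbal = 7, .lver = 3 };

	if (return_dblock)
		memcpy(dblock_ret, &dblock, sizeof(dblock));

	return 0; /* success either way, so the caller trusts dblock_ret */
}

/*
 * Hypothetical caller. zero_first mirrors the added memset in the
 * caller: it clears the local before the acquire call.
 * Returns the lver the caller would later write back to disk.
 */
static uint64_t demo_caller_lver(int return_dblock, int zero_first)
{
	struct demo_dblock dblock;

	/* Simulated stack garbage (assumption, see lead-in). */
	memset(&dblock, 0xAA, sizeof(dblock));

	if (zero_first)
		memset(&dblock, 0, sizeof(dblock));

	demo_acquire(return_dblock, &dblock);
	return dblock.lver;
}
```

In this sketch, demo_caller_lver(0, 0) yields the simulated garbage
value 0xAAAAAAAAAAAAAAAA, which is larger than the real lver of 3 and
corresponds to the "larger lver in bk" symptom. demo_caller_lver(1, 0)
yields 3, and demo_caller_lver(0, 1) yields 0, so the memset acts as a
second line of defense if some other path ever skips the copy.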