[DLM] do full recover_locks barrier
author David Teigland <teigland@redhat.com>
Wed, 1 Nov 2006 15:31:48 +0000 (09:31 -0600)
committer Steven Whitehouse <swhiteho@redhat.com>
Thu, 30 Nov 2006 15:35:24 +0000 (10:35 -0500)
commit 4b77f2c93d052adca8cc8690b9b5e7f8798f4ddd
tree b61c4923c355d36875bf878212dfc1b2f1f0f7ba
parent 2cdc98aaf072d573df10c503d3b3b0b74e2a6d06

Red Hat BZ 211914

The previous patch, "[DLM] fix aborted recovery during node removal", was
incomplete, as further testing showed.  It set the bit for the RS_LOCKS
barrier but did not then wait for the barrier.  This is often harmless,
but it sometimes causes yet another recovery hang.  If the node that
skips the barrier wait is a new node that also has the lowest nodeid, it
misses the important step of collecting and reporting the barrier status
from the other nodes (which is the job of the low nodeid in the barrier
wait routine).
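
The failure mode described above can be illustrated with a small
user-space model.  This is only a sketch under stated assumptions, not
the code in fs/dlm/recoverd.c: RS_LOCKS is the name used in this
message, while RS_LOCKS_ALL, struct node, barrier_wait_poll() and
run_recovery() are hypothetical names invented for the illustration.  It
also assumes that the other nodes wait for the low nodeid to report that
every node has reached the barrier, and it stands in for a hang with a
bounded number of polling rounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES     16
#define RS_LOCKS      0x1u  /* this node reached the barrier */
#define RS_LOCKS_ALL  0x2u  /* hypothetical: low nodeid saw RS_LOCKS everywhere */

struct node {               /* hypothetical model of one cluster member */
	int nodeid;
	unsigned status;
};

/* Return the member with the lowest nodeid. */
static struct node *low_node(struct node *nodes, size_t n)
{
	struct node *low = &nodes[0];

	for (size_t i = 1; i < n; i++)
		if (nodes[i].nodeid < low->nodeid)
			low = &nodes[i];
	return low;
}

/* One polling pass of the barrier wait; true once `self` may proceed. */
static bool barrier_wait_poll(struct node *nodes, size_t n, struct node *self)
{
	struct node *low = low_node(nodes, n);

	if (self == low) {
		/* Low nodeid: collect the status of every node ... */
		for (size_t i = 0; i < n; i++)
			if (!(nodes[i].status & RS_LOCKS))
				return false;
		/* ... and report that all of them reached the barrier. */
		self->status |= RS_LOCKS_ALL;
		return true;
	}
	/* Everyone else waits for the low nodeid's report. */
	return (low->status & RS_LOCKS_ALL) != 0;
}

/*
 * Every node sets its RS_LOCKS bit, then polls the barrier.  When
 * low_skips_wait is true, the low nodeid sets the bit but never waits.
 * Returns true if all nodes pass within max_rounds, false otherwise
 * (the model's stand-in for a recovery hang).
 */
bool run_recovery(struct node *nodes, size_t n, bool low_skips_wait,
		  int max_rounds)
{
	struct node *low = low_node(nodes, n);
	bool passed[MAX_NODES] = { false };

	assert(n > 0 && n <= MAX_NODES);

	for (size_t i = 0; i < n; i++)
		nodes[i].status = RS_LOCKS;

	for (int round = 0; round < max_rounds; round++) {
		bool all = true;

		for (size_t i = 0; i < n; i++) {
			if (passed[i])
				continue;
			if (&nodes[i] == low && low_skips_wait)
				passed[i] = true;  /* skips collecting and reporting */
			else
				passed[i] = barrier_wait_poll(nodes, n, &nodes[i]);
			if (!passed[i])
				all = false;
		}
		if (all)
			return true;
	}
	return false;
}
```

In this model, calling run_recovery() with low_skips_wait set to false
lets every node pass the barrier, whereas setting it to true leaves the
remaining nodes polling until max_rounds runs out, because nobody ever
publishes the collected status.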

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
fs/dlm/recoverd.c