From: Lars Ellenberg Date: Tue, 14 Sep 2010 18:26:27 +0000 (+0200) Subject: drbd: fix for possible deadlock on IO error during resync X-Git-Tag: v2.6.37-rc1~167^2~11 X-Git-Url: http://git.openpandora.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=e9e6f3ec535d7b7c9e2ca64ad691e743e7d3c2f0;p=pandora-kernel.git drbd: fix for possible deadlock on IO error during resync Scenario: Something (say, flush-147:0) is in drbd_al_begin_io, holding a local_cnt, waiting for the resync to make progress. Disk fails, worker in after_state_ch does drbd_rs_cancel_all, then waits for local_cnt to drop to zero. flush-147:0 is woken by drbd_rs_cancel_all, needs to write an AL transaction, and queues that on the worker. Deadlock. Fix: do not wait in the worker, have put_ldev() trigger the state change D_FAILED -> D_DISKLESS when necessary. put_ldev() cannot do the state change directly, as it may or may not already hold various spinlocks. We queue a short work instead. Signed-off-by: Philipp Reisner Signed-off-by: Lars Ellenberg --- Reading git-diff-tree failed