From: Krzysztof Wojcik Date: Fri, 4 Feb 2011 13:18:26 +0000 (+0100) Subject: FIX: md: process hangs at wait_barrier after 0->10 takeover X-Git-Tag: v2.6.38-rc5~55^2 X-Git-Url: http://git.openpandora.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=02214dc5461c36da26a34014cab4e1bb484edba2;p=pandora-kernel.git FIX: md: process hangs at wait_barrier after 0->10 takeover Following symptoms were observed: 1. After raid0->raid10 takeover operation we have array with 2 missing disks. When we add disk for rebuild, recovery process starts as expected but it does not finish- it stops at about 90%, md126_resync process hangs in "D" state. 2. Similar behavior is when we have mounted raid0 array and we execute takeover to raid10. After this when we try to unmount array- it causes process umount hangs in "D" In scenarios above processes hang at the same function- wait_barrier in raid10.c. Process waits in macro "wait_event_lock_irq" until the "!conf->barrier" condition will be true. In scenarios above it never happens. Reason was that at the end of level_store, after calling pers->run, we call mddev_resume. This calls pers->quiesce(mddev, 0) with RAID10, that calls lower_barrier. However raise_barrier hadn't been called on that 'conf' yet, so conf->barrier becomes negative, which is bad. This patch introduces setting conf->barrier=1 after takeover operation. It prevents to become barrier negative after call lower_barrier(). Signed-off-by: Krzysztof Wojcik Signed-off-by: NeilBrown --- Reading git-diff-tree failed