Pandora Sourcecodes - pandora-kernel.git/commit

author	Lars Ellenberg <lars.ellenberg@linbit.com>
	Sun, 19 Dec 2010 10:29:55 +0000 (11:29 +0100)
committer	Philipp Reisner <philipp.reisner@linbit.com>
	Thu, 10 Mar 2011 10:45:08 +0000 (11:45 +0100)
commit	725a97e43ee945cc813fffd9e628e50d703b973b
tree	ec67dbfccf0b3a43cb879056a1fb320b82b8dd2d
parent	06d33e968d2c58143a7aaafa8963cf6a58099467

drbd: fix potential access of on-stack wait_queue_head_t after return

I ran into something declaring itself as a "spinlock deadlock",
BUG: spinlock lockup on CPU#1, kjournald/27816, ffff88000ad6bca0
Pid: 27816, comm: kjournald Tainted: G        W 2.6.34.6 #2
Call Trace:
  <IRQ>  [<ffffffff811ba0aa>] do_raw_spin_lock+0x11e/0x14d
  [<ffffffff81340fde>] _raw_spin_lock_irqsave+0x6a/0x81
  [<ffffffff8103b694>] ? __wake_up+0x22/0x50
  [<ffffffff8103b694>] __wake_up+0x22/0x50
  [<ffffffffa07ff661>] bm_async_io_complete+0x258/0x299 [drbd]
but the call traces do not fit at all:
all other CPUs are in cpu_idle.

I think it may be this race:

drbd_bm_write_page
  wait_queue_head_t io_wait;
  atomic_t in_flight;
  bm_async_io
    submit_bio
                                bm_async_io_complete
                                  if (atomic_dec_and_test(in_flight))
  wait_event(io_wait,
    atomic_read(in_flight) == 0)
  return
                                    wake_up(io_wait)

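wait_event sees in_flight == 0 and returns without sleeping, so
drbd_bm_write_page can return before wake_up runs. The wake_up then
accesses the wait_queue_head_t spinlock, which is no longer valid
because the stack frame of drbd_bm_write_page has already been clobbered.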

Fix this by using struct completion, which performs both the condition
test and the wake_up inside its spinlock, so this race cannot happen.
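
For illustration only, here is a userspace sketch of why the completion
pattern closes the race. It is not the DRBD patch: it assumes POSIX threads
and C11 atomics, the completion type is a simplified stand-in for the
kernel's struct completion, and the names bm_io_ctx, io_complete and
write_page_demo are hypothetical. The point is that the flag is set and the
waiter is woken under one lock, and the waiter must take that same lock
before it can return and release its stack frame.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified userspace stand-in for the kernel's struct completion. */
struct completion {
    pthread_mutex_t lock;
    pthread_cond_t  wait;
    bool            done;
};

static void init_completion(struct completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->wait, NULL);
    c->done = false;
}

/* Set the condition and wake the waiter while holding the lock. */
static void complete(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_signal(&c->wait);
    pthread_mutex_unlock(&c->lock);
}

/* The condition is tested under the same lock, so the waiter cannot
 * return while the completer is still inside complete(). */
static void wait_for_completion(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done)
        pthread_cond_wait(&c->wait, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

/* Hypothetical per-request context that lives on the submitter's stack. */
struct bm_io_ctx {
    atomic_int        in_flight;
    struct completion done;
};

/* Plays the role of bm_async_io_complete: the last finisher signals. */
static void *io_complete(void *arg)
{
    struct bm_io_ctx *ctx = arg;

    if (atomic_fetch_sub(&ctx->in_flight, 1) == 1)
        complete(&ctx->done);
    return NULL;
}

/* Plays the role of drbd_bm_write_page; returns the in-flight count
 * observed after the wait, which should be 0. */
int write_page_demo(void)
{
    enum { NR_IO = 2 };
    struct bm_io_ctx ctx;               /* on-stack, like the original */
    pthread_t io[NR_IO];
    bool started[NR_IO];
    int remaining;

    atomic_init(&ctx.in_flight, NR_IO);
    init_completion(&ctx.done);

    for (int i = 0; i < NR_IO; i++) {
        started[i] = pthread_create(&io[i], NULL, io_complete, &ctx) == 0;
        if (!started[i])
            io_complete(&ctx);          /* fall back to inline completion */
    }

    wait_for_completion(&ctx.done);
    remaining = atomic_load(&ctx.in_flight);

    for (int i = 0; i < NR_IO; i++)
        if (started[i])
            pthread_join(io[i], NULL);

    pthread_cond_destroy(&ctx.done.wait);
    pthread_mutex_destroy(&ctx.done.lock);
    return remaining;
}
```

Calling write_page_demo() from a main() should return 0. In contrast, a
wait that checks the counter without taking the waker's lock can return
while the waker is still about to touch the on-stack object.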

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>