perf: Fix race in removing an event
authorPeter Zijlstra <peterz@infradead.org>
Fri, 2 May 2014 14:56:01 +0000 (16:56 +0200)
committerBen Hutchings <ben@decadent.org.uk>
Fri, 11 Jul 2014 12:33:56 +0000 (13:33 +0100)
commit09b0c6269f9683d39d43e4ba0c1cfdcd04f40c11
tree1fe44842397ef9b79cd4175098d7147f29074767
parentf935bf46b0748d78d5d8c9725312aee3443f19d0
perf: Fix race in removing an event

commit 46ce0fe97a6be7532ce6126bb26ce89fed81528c upstream.

When removing a (sibling) event we do:

raw_spin_lock_irq(&ctx->lock);
perf_group_detach(event);
raw_spin_unlock_irq(&ctx->lock);

<hole>

perf_remove_from_context(event);
raw_spin_lock_irq(&ctx->lock);
...
raw_spin_unlock_irq(&ctx->lock);

Now, assuming the event is a sibling, it will be 'unreachable' for
things like ctx_sched_out() because that iterates the
groups->siblings, and we just unhooked the sibling.

So, if during <hole> we get ctx_sched_out(), it will miss the event
and not call event_sched_out() on it, leaving it programmed on the
PMU.

The subsequent perf_remove_from_context() call will find the ctx is
inactive and only call list_del_event() to remove the event from all
other lists.

Hereafter we can proceed to free the event; while still programmed!

Close this hole by moving perf_group_detach() inside the same
ctx->lock region(s) perf_remove_from_context() has.

The condition on inherited events only in __perf_event_exit_task() is
likely complete crap because non-inherited events are part of groups
too and we're tearing down just the same. But leave that for another
patch.

Most-likely-Fixes: e03a9a55b4e ("perf: Change close() semantics for group events")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Much-staring-at-traces-by: Vince Weaver <vincent.weaver@maine.edu>
Much-staring-at-traces-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140505093124.GN17778@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.2: drop change in perf_pmu_migrate_context()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
kernel/events/core.c