workqueue: fix possible stall on try_to_grab_pending() of a delayed work item
authorLai Jiangshan <laijs@cn.fujitsu.com>
Tue, 18 Sep 2012 17:40:00 +0000 (10:40 -0700)
committerBen Hutchings <ben@decadent.org.uk>
Wed, 17 Oct 2012 02:48:33 +0000 (03:48 +0100)
commit69db9d4382f1a90aa4343e7a4aa4ead72654ca8f
tree4985619bb31f4a9b79cbf7845103cbf66e98470b
parent606d8a8ad41c307aa763579e50f81c565c210f48
workqueue: fix possible stall on try_to_grab_pending() of a delayed work item

commit 3aa62497594430ea522050b75c033f71f2c60ee6 upstream.

Currently, when try_to_grab_pending() grabs a delayed work item, it
leaves its linked work items alone on the delayed_works.  The linked
work items are always NO_COLOR and will cause future
cwq_activate_first_delayed() increase cwq->nr_active incorrectly, and
may cause the whole cwq to stall.  For example,

state: cwq->max_active = 1, cwq->nr_active = 1
       one work in cwq->pool, many in cwq->delayed_works.

step1: try_to_grab_pending() removes a work item from delayed_works
       but leaves its NO_COLOR linked work items on it.

step2: Later on, cwq_activate_first_delayed() activates the linked
       work item increasing ->nr_active.

step3: cwq->nr_active = 1, but all activated work items of the cwq are
       NO_COLOR.  When they finish, cwq->nr_active will not be
       decreased due to NO_COLOR, and no further work items will be
       activated from cwq->delayed_works. the cwq stalls.

Fix it by ensuring the target work item is activated before stealing
PENDING in try_to_grab_pending().  This ensures that all the linked
work items are activated without incorrectly bumping cwq->nr_active.

tj: Updated comment and description.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
kernel/workqueue.c