mm: protect set_page_dirty() from ongoing truncation
author Johannes Weiner <hannes@cmpxchg.org>
Thu, 8 Jan 2015 22:32:18 +0000 (14:32 -0800)
committer Ben Hutchings <ben@decadent.org.uk>
Fri, 20 Feb 2015 00:49:35 +0000 (00:49 +0000)
commit 0330c992f554d28bd2d3b1973a825f520e7a3556
tree 64b064053cb1668b1eaa021127bc6c28cc0531df
parent 57b31943b128c88c591005f122005c033e5d6409

commit 2d6d7f98284648c5ed113fe22a132148950b140f upstream.

Tejun, while reviewing the code, spotted the following race condition
between the dirtying and truncation of a page:

__set_page_dirty_nobuffers()       __delete_from_page_cache()
  if (TestSetPageDirty(page))
                                     page->mapping = NULL
                                     if (PageDirty())
                                       dec_zone_page_state(page, NR_FILE_DIRTY);
                                       dec_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
    if (page->mapping)
      account_page_dirtied(page)
        __inc_zone_page_state(page, NR_FILE_DIRTY);
        __inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);

which results in an imbalance of NR_FILE_DIRTY and BDI_RECLAIMABLE.
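
The interleaving above can be replayed deterministically in a small
user-space model.  This is only a sketch: every toy_* name is made up
for illustration and is not a kernel definition, a single counter
stands in for both NR_FILE_DIRTY and BDI_RECLAIMABLE, and no real
locking or concurrency is involved.  Under this reading of the diagram,
truncation decrements for a dirty bit that was never accounted, and the
dirtier then skips its increment because the mapping is already gone.

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space model; these names are illustrative, not kernel definitions. */
struct toy_mapping { long nr_file_dirty; };   /* stands in for NR_FILE_DIRTY */
struct toy_page {
    struct toy_mapping *mapping;              /* NULL once truncated */
    int dirty;                                /* stands in for the dirty page flag */
};

/* Dirtier, step 1: set the dirty flag and return its previous value. */
static int toy_test_set_dirty(struct toy_page *p)
{
    int old = p->dirty;
    p->dirty = 1;
    return old;
}

/* Dirtier, step 2: account the page only if it is still attached. */
static void toy_account_if_mapped(struct toy_page *p, struct toy_mapping *m)
{
    if (p->mapping)
        m->nr_file_dirty++;
}

/* Truncation: detach the page, then undo dirty accounting if the flag is set. */
static void toy_delete_from_cache(struct toy_page *p, struct toy_mapping *m)
{
    p->mapping = NULL;
    if (p->dirty)
        m->nr_file_dirty--;
}

/* Replays the racy ordering from the diagram; returns the final counter. */
long toy_replay_unlocked_race(void)
{
    struct toy_mapping m = { 0 };
    struct toy_page p = { &m, 0 };

    int was_dirty = toy_test_set_dirty(&p);   /* dirtier sets the flag ...      */
    assert(was_dirty == 0);                   /* ... on a page that was clean   */
    toy_delete_from_cache(&p, &m);            /* truncation slips in between    */
    toy_account_if_mapped(&p, &m);            /* dirtier resumes: mapping gone  */
    return m.nr_file_dirty;
}
```

In this model toy_replay_unlocked_race() returns -1 rather than 0,
which is the kind of counter imbalance described above.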

Dirtiers usually lock out truncation, either by holding the page lock
directly or, in the case of zap_pte_range(), by pinning the mapcount with
the page table lock held.  The notable exception to this rule, though,
is do_wp_page(), for which this race exists.  However, do_wp_page()
already waits for a locked page to unlock before setting the dirty bit,
in order to prevent a race where clear_page_dirty() misses the page bit
in the presence of dirty ptes.  Upgrade that wait to a fully locked
set_page_dirty() to also cover the situation explained above.
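
As a rough illustration of why a fully locked set_page_dirty() closes
the window, the sketch below models the page lock as a plain flag and
runs the dirtier and truncation as indivisible steps, in either order.
All toy_* names are hypothetical, a single counter stands in for the
dirty statistics, and the model is single-threaded: it assumes the lock
serializes the two paths rather than demonstrating it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space model; these names are illustrative, not kernel definitions. */
struct toy_mapping { long nr_file_dirty; };   /* stands in for NR_FILE_DIRTY */
struct toy_page {
    struct toy_mapping *mapping;              /* NULL once truncated */
    int dirty;                                /* stands in for the dirty page flag */
    int locked;                               /* stands in for the page lock */
};

static void toy_lock_page(struct toy_page *p)
{
    assert(!p->locked);                       /* the model never contends */
    p->locked = 1;
}

static void toy_unlock_page(struct toy_page *p)
{
    assert(p->locked);
    p->locked = 0;
}

/* Dirtier: flag update and accounting both happen under the page lock. */
static void toy_set_page_dirty_locked(struct toy_page *p)
{
    toy_lock_page(p);
    if (!p->dirty) {
        p->dirty = 1;
        if (p->mapping)
            p->mapping->nr_file_dirty++;
    }
    toy_unlock_page(p);
}

/* Truncation: holds the same lock, so it sees a consistent flag and counter. */
static void toy_truncate_locked(struct toy_page *p)
{
    struct toy_mapping *m;

    toy_lock_page(p);
    m = p->mapping;
    p->mapping = NULL;
    if (m && p->dirty)
        m->nr_file_dirty--;
    toy_unlock_page(p);
}

/* Replays one of the two possible orderings; returns the final counter. */
long toy_replay_locked(int dirty_first)
{
    struct toy_mapping m = { 0 };
    struct toy_page p = { &m, 0, 0 };

    if (dirty_first) {
        toy_set_page_dirty_locked(&p);
        toy_truncate_locked(&p);
    } else {
        toy_truncate_locked(&p);
        toy_set_page_dirty_locked(&p);
    }
    return m.nr_file_dirty;
}
```

Both toy_replay_locked(1) and toy_replay_locked(0) return 0 in this
model: with the two paths serialized, every decrement is paired with an
earlier increment.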

Afterwards, the code in set_page_dirty() dealing with a truncation race
is no longer needed.  Remove it.

Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2:
 - Adjust context
 - Use VM_BUG_ON() rather than VM_BUG_ON_PAGE()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
include/linux/writeback.h
mm/memory.c
mm/page-writeback.c