memcg: remove incorrect underflow check
authorGreg Thelen <gthelen@google.com>
Fri, 1 Nov 2013 19:16:59 +0000 (12:16 -0700)
committerLinus Torvalds <torvalds@linux-foundation.org>
Fri, 1 Nov 2013 19:22:28 +0000 (12:22 -0700)
When a memcg is deleted mem_cgroup_reparent_charges() moves charged
memory to the parent memcg.  As of v3.11-9444-g3ea67d0 "memcg: add per
cgroup writeback pages accounting" there's bad pointer read.  The goal
was to check for counter underflow.  The counter is a per cpu counter
and there are two problems with the code:

 (1) per cpu access function isn't used, instead a naked pointer is used
     which easily causes oops.
 (2) the check doesn't sum all cpus

Test:
  $ cd /sys/fs/cgroup/memory
  $ mkdir x
  $ echo 3 > /proc/sys/vm/drop_caches
  $ (echo $BASHPID >> x/tasks && exec cat) &
  [1] 7154
  $ grep ^mapped x/memory.stat
  mapped_file 53248
  $ echo 7154 > tasks
  $ rmdir x
  <OOPS>

The fix is to remove the check.  It's currently dangerous and isn't
worth fixing it to use something expensive, such as
percpu_counter_sum(), for each reparented page.  __this_cpu_read() isn't
enough to fix this because there's no guarantees of the current cpus
count.  The only guarantees is that the sum of all per-cpu counter is >=
nr_pages.

Fixes: 3ea67d06e467 ("memcg: add per cgroup writeback pages accounting")
Reported-and-tested-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Greg Thelen <gthelen@google.com>
Reviewed-by: Sha Zhengju <handai.szj@taobao.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/memcontrol.c

index e632782..13b9d0f 100644 (file)
@@ -3782,7 +3782,6 @@ void mem_cgroup_move_account_page_stat(struct mem_cgroup *from,
 {
        /* Update stat data for mem_cgroup */
        preempt_disable();
-       WARN_ON_ONCE(from->stat->count[idx] < nr_pages);
        __this_cpu_sub(from->stat->count[idx], nr_pages);
        __this_cpu_add(to->stat->count[idx], nr_pages);
        preempt_enable();