mm/hotplug: correctly add new zone to all other nodes' zone lists
author Jiang Liu <jiang.liu@huawei.com>
Tue, 31 Jul 2012 23:43:30 +0000 (16:43 -0700)
committer Ben Hutchings <ben@decadent.org.uk>
Wed, 20 Mar 2013 15:03:39 +0000 (15:03 +0000)
commit 08dff7b7d629807dbb1f398c68dd9cd58dd657a1 upstream.

When online_pages() is called to add new memory to an empty zone, it
rebuilds all zone lists by calling build_all_zonelists().  But a bug
prevents the new zone from being added to other nodes' zone lists.

online_pages() {
    build_all_zonelists()
    .....
    node_set_state(zone_to_nid(zone), N_HIGH_MEMORY)
}

Here the node of the zone is put into N_HIGH_MEMORY state after calling
build_all_zonelists(), but build_all_zonelists() only adds zones from
nodes in N_HIGH_MEMORY state to the fallback zone lists.

build_all_zonelists()
    ->__build_all_zonelists()
        ->build_zonelists()
            ->find_next_best_node()
                ->for_each_node_state(n, N_HIGH_MEMORY)

So memory in the new zone will never be used by other nodes, which may
cause strange behavior when the system is under memory pressure.  Fix
this by putting the node into N_HIGH_MEMORY state before calling
build_all_zonelists().
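
The ordering problem can be illustrated with a small userspace model.
This is only a sketch, not kernel code: every toy_* name, the bitmask
representation and MAX_NODES are assumptions made for illustration.  It
models only the one property described above, namely that the rebuild
step consults the set of nodes already marked as having memory.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 4	/* hypothetical toy limit, not the kernel's MAX_NUMNODES */

/* Toy stand-in for the N_HIGH_MEMORY node state mask (one bit per node). */
static unsigned int toy_nodes_with_memory;

/* Toy fallback lists: bit n of toy_fallback[i] means node i may allocate from node n. */
static unsigned int toy_fallback[MAX_NODES];

/* Toy counterpart of node_set_state(nid, N_HIGH_MEMORY). */
static void toy_node_set_state(int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	toy_nodes_with_memory |= 1u << nid;
}

/*
 * Toy counterpart of build_all_zonelists(): each node's fallback list is
 * built only from nodes currently marked as having memory, mirroring the
 * for_each_node_state(n, N_HIGH_MEMORY) walk in find_next_best_node().
 */
static void toy_build_all_zonelists(void)
{
	for (int i = 0; i < MAX_NODES; i++)
		toy_fallback[i] = toy_nodes_with_memory;
}

/*
 * Simulate onlining memory on new_nid while observer already has memory.
 * If state_first is true the node state is set before the rebuild (the
 * ordering this patch introduces); otherwise it is set afterwards (the
 * old ordering).  Returns true if observer can fall back to new_nid.
 */
static bool toy_online(bool state_first, int observer, int new_nid)
{
	assert(observer >= 0 && observer < MAX_NODES);

	/* Start from a system where only observer has memory. */
	toy_nodes_with_memory = 0;
	toy_node_set_state(observer);
	toy_build_all_zonelists();

	if (state_first)
		toy_node_set_state(new_nid);
	toy_build_all_zonelists();
	if (!state_first)
		toy_node_set_state(new_nid);

	return (toy_fallback[observer] & (1u << new_nid)) != 0;
}
```

In this model, toy_online(false, 0, 1) leaves node 1 out of node 0's
fallback list, while toy_online(true, 0, 1) includes it.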

Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Jiang Liu <liuj97@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Keping Chen <chenkeping@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
mm/memory_hotplug.c

index 9ad7d1e..09d87b7 100644
@@ -515,19 +515,20 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages)
 
        zone->present_pages += onlined_pages;
        zone->zone_pgdat->node_present_pages += onlined_pages;
-       if (need_zonelists_rebuild)
-               build_all_zonelists(zone);
-       else
-               zone_pcp_update(zone);
+       if (onlined_pages) {
+               node_set_state(zone_to_nid(zone), N_HIGH_MEMORY);
+               if (need_zonelists_rebuild)
+                       build_all_zonelists(zone);
+               else
+                       zone_pcp_update(zone);
+       }
 
        mutex_unlock(&zonelists_mutex);
 
        init_per_zone_wmark_min();
 
-       if (onlined_pages) {
+       if (onlined_pages)
                kswapd_run(zone_to_nid(zone));
-               node_set_state(zone_to_nid(zone), N_HIGH_MEMORY);
-       }
 
        vm_total_pages = nr_free_pagecache_pages();