vmscan: don't accumulate scan pressure on unrelated lists
authorJohannes Weiner <hannes@saeurebad.de>
Sun, 19 Oct 2008 03:26:55 +0000 (20:26 -0700)
committerLinus Torvalds <torvalds@linux-foundation.org>
Mon, 20 Oct 2008 15:52:31 +0000 (08:52 -0700)
During each reclaim scan we accumulate scan pressure on unrelated lists
which will result in bogus scans and unwanted reclaims eventually.

Scanning lists with few reclaim candidates results in a lot of rotation
and therefor also disturbs the list balancing, putting even more
pressure on the wrong lists.

In a test-case with much streaming IO, and therefor a crowded inactive
file page list, swapping started because

  a) anon pages were reclaimed after swap_cluster_max reclaim
  invocations -- nr_scan of this list has just accumulated

  b) active file pages were scanned because *their* nr_scan has also
  accumulated through the same logic.  And this in return created a
  lot of rotation for file pages and resulted in a decrease of file
  list priority, again increasing the pressure on anon pages.

The result was an evicted working set of anon pages while there were
tons of inactive file pages that should have been taken instead.

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/vmscan.c

index ca64e3e..412d787 100644 (file)
@@ -1413,16 +1413,13 @@ static unsigned long shrink_zone(int priority, struct zone *zone,
                if (scan_global_lru(sc)) {
                        int file = is_file_lru(l);
                        int scan;
-                       /*
-                        * Add one to nr_to_scan just to make sure that the
-                        * kernel will slowly sift through each list.
-                        */
+
                        scan = zone_page_state(zone, NR_LRU_BASE + l);
                        if (priority) {
                                scan >>= priority;
                                scan = (scan * percent[file]) / 100;
                        }
-                       zone->lru[l].nr_scan += scan + 1;
+                       zone->lru[l].nr_scan += scan;
                        nr[l] = zone->lru[l].nr_scan;
                        if (nr[l] >= sc->swap_cluster_max)
                                zone->lru[l].nr_scan = 0;