tmpfs: distribute interleave better across nodes
authorNathan Zimmer <nzimmer@sgi.com>
Tue, 31 Jul 2012 23:46:17 +0000 (16:46 -0700)
committerLinus Torvalds <torvalds@linux-foundation.org>
Wed, 1 Aug 2012 01:42:50 +0000 (18:42 -0700)
When tmpfs has the interleave memory policy, it always starts allocating
for each file from node 0 at offset 0.  When there are many small files,
the lower nodes fill up disproportionately.

This patch spreads out node usage by starting files at nodes other than 0,
by using the inode number to bias the starting node for interleave.

Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/shmem.c

index c15b998..d4e184e 100644 (file)
@@ -929,7 +929,8 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
 
        /* Create a pseudo vma that just contains the policy */
        pvma.vm_start = 0;
-       pvma.vm_pgoff = index;
+       /* Bias interleave by inode number to distribute better across nodes */
+       pvma.vm_pgoff = index + info->vfs_inode.i_ino;
        pvma.vm_ops = NULL;
        pvma.vm_policy = spol;
        return swapin_readahead(swap, gfp, &pvma, 0);
@@ -942,7 +943,8 @@ static struct page *shmem_alloc_page(gfp_t gfp,
 
        /* Create a pseudo vma that just contains the policy */
        pvma.vm_start = 0;
-       pvma.vm_pgoff = index;
+       /* Bias interleave by inode number to distribute better across nodes */
+       pvma.vm_pgoff = index + info->vfs_inode.i_ino;
        pvma.vm_ops = NULL;
        pvma.vm_policy = mpol_shared_policy_lookup(&info->policy, index);