Btrfs: fix race between multi-task space allocation and caching space
authorMiao Xie <miaox@cn.fujitsu.com>
Fri, 9 Sep 2011 09:34:35 +0000 (17:34 +0800)
committerDavid Sterba <dsterba@suse.cz>
Thu, 20 Oct 2011 16:10:49 +0000 (18:10 +0200)
The task may fail to get free space though it is enough when multi-task
space allocation and caching space happen at the same time.

Task1 Caching Thread Task2
------------------------------------------------------------------------
find_free_extent
  The space has not
  be cached, and start
  caching thread. And
  wait for it.
cache space, if
the space is > 2MB
wake up Task1
find_free_extent
  get all the space that
  is cached.
  try to allocate space,
  but there is no space
  now.
trigger BUG_ON()

The message is following:
btrfs allocation failed flags 1, wanted 4096
space_info has 1040187392 free, is not full
space_info total=1082130432, used=4096, pinned=41938944, reserved=0, may_use=40828928, readonly=0
block group 12582912 has 8388608 bytes, 0 used 8388608 pinned 0 reserved
block group has cluster?: no
0 blocks of free space at or bigger than bytes is
block group 1103101952 has 1073741824 bytes, 4096 used 33550336 pinned 0 reserved
block group has cluster?: no
0 blocks of free space at or bigger than bytes is
------------[ cut here ]------------
kernel BUG at fs/btrfs/inode.c:835!
 [<ffffffffa031261b>] __extent_writepage+0x1bf/0x5ce [btrfs]
 [<ffffffff810cbcb8>] ? __set_page_dirty_nobuffers+0xfe/0x108
 [<ffffffffa02f8ada>] ? wait_current_trans+0x23/0xec [btrfs]
 [<ffffffff810c3fbf>] ? find_get_pages_tag+0x73/0xe2
 [<ffffffffa0312d12>] extent_write_cache_pages.clone.0+0x176/0x29a [btrfs]
 [<ffffffffa0312e74>] extent_writepages+0x3e/0x53 [btrfs]
 [<ffffffff8110ad2c>] ? do_sync_write+0xc6/0x103
 [<ffffffffa0302d6e>] ? btrfs_submit_direct+0x414/0x414 [btrfs]
 [<ffffffff811380fa>] ? fsnotify+0x236/0x266
 [<ffffffffa02fc930>] btrfs_writepages+0x22/0x24 [btrfs]
 [<ffffffff810cc215>] do_writepages+0x1c/0x25
 [<ffffffff810c4958>] __filemap_fdatawrite_range+0x4e/0x50
 [<ffffffff810c4982>] filemap_write_and_wait_range+0x28/0x51
 [<ffffffffa0306b2e>] btrfs_sync_file+0x7d/0x198 [btrfs]
 [<ffffffff8110aa26>] ? fsnotify_modify+0x5d/0x65
 [<ffffffff8112d150>] vfs_fsync_range+0x18/0x21
 [<ffffffff8112d170>] vfs_fsync+0x17/0x19
 [<ffffffff8112d316>] do_fsync+0x29/0x3e
 [<ffffffff8112d348>] sys_fsync+0xb/0xf
 [<ffffffff81468352>] system_call_fastpath+0x16/0x1b
[SNIP]
RIP  [<ffffffffa02fe08c>] cow_file_range+0x1c4/0x32b [btrfs]

We fix this bug by trying to allocate the space again if there are block groups
in caching.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
fs/btrfs/extent-tree.c

index 6cfcc90..cef355f 100644 (file)
@@ -4954,6 +4954,7 @@ static noinline int find_free_extent(struct btrfs_trans_handle *trans,
        bool failed_cluster_refill = false;
        bool failed_alloc = false;
        bool use_cluster = true;
+       bool have_caching_bg = false;
        u64 ideal_cache_percent = 0;
        u64 ideal_cache_offset = 0;
 
@@ -5036,6 +5037,7 @@ ideal_cache:
                }
        }
 search:
+       have_caching_bg = false;
        down_read(&space_info->groups_sem);
        list_for_each_entry(block_group, &space_info->block_groups[index],
                            list) {
@@ -5244,6 +5246,8 @@ refill_cluster:
                        failed_alloc = true;
                        goto have_block_group;
                } else if (!offset) {
+                       if (!cached)
+                               have_caching_bg = true;
                        goto loop;
                }
 checks:
@@ -5294,6 +5298,9 @@ loop:
        }
        up_read(&space_info->groups_sem);
 
+       if (!ins->objectid && loop >= LOOP_CACHING_WAIT && have_caching_bg)
+               goto search;
+
        if (!ins->objectid && ++index < BTRFS_NR_RAID_TYPES)
                goto search;