Btrfs: fix race of using total_bytes_pinned
authorLiu Bo <bo.li.liu@oracle.com>
Wed, 2 Jul 2014 08:58:01 +0000 (16:58 +0800)
committerChris Mason <clm@fb.com>
Thu, 3 Jul 2014 14:04:15 +0000 (07:04 -0700)
This percpu counter @total_bytes_pinned is introduced to skip unnecessary
operations of 'commit transaction', it accounts for those space we may free
but are stuck in delayed refs.

And we zero out @space_info->total_bytes_pinned every transaction period so
we have a better idea of how much space we'll actually free up by committing
this transaction.  However, we do the 'zero out' part a little earlier, before
we actually unpin space, so we end up returning ENOSPC when we actually have
free space that's just unpinned from committing transaction.

xfstests/generic/074 complained then.

This fixes it by actually accounting the percpu pinned number when 'unpin',
and since it's protected by space_info->lock, the race is gone now.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
fs/btrfs/extent-tree.c

index 99c2539..813537f 100644 (file)
@@ -5678,7 +5678,6 @@ void btrfs_prepare_extent_commit(struct btrfs_trans_handle *trans,
        struct btrfs_caching_control *next;
        struct btrfs_caching_control *caching_ctl;
        struct btrfs_block_group_cache *cache;
-       struct btrfs_space_info *space_info;
 
        down_write(&fs_info->commit_root_sem);
 
@@ -5701,9 +5700,6 @@ void btrfs_prepare_extent_commit(struct btrfs_trans_handle *trans,
 
        up_write(&fs_info->commit_root_sem);
 
-       list_for_each_entry_rcu(space_info, &fs_info->space_info, list)
-               percpu_counter_set(&space_info->total_bytes_pinned, 0);
-
        update_global_block_rsv(fs_info);
 }
 
@@ -5741,6 +5737,7 @@ static int unpin_extent_range(struct btrfs_root *root, u64 start, u64 end)
                spin_lock(&cache->lock);
                cache->pinned -= len;
                space_info->bytes_pinned -= len;
+               percpu_counter_add(&space_info->total_bytes_pinned, -len);
                if (cache->ro) {
                        space_info->bytes_readonly += len;
                        readonly = true;