[RFC 1/1] ext4: Fix race in ext4_mb_discard_group_preallocations()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



There could be a race in function ext4_mb_discard_group_preallocations()
where the 1st thread may iterate through group's bb_prealloc_list and
remove all the PAs and add to function's local list head.
Now if the 2nd thread comes in to discard the group preallocations,
it will see that the group->bb_prealloc_list is empty and will return 0.

Consider for a case where we have less number of groups (for e.g. just group 0),
this may even return an -ENOSPC error from ext4_mb_new_blocks()
(where we call for ext4_mb_discard_group_preallocations()).
But that is wrong, since 2nd thread should have waited for 1st thread to release
all the PAs and should have retried for allocation. Since 1st thread
was anyway going to discard the PAs.

This patch fixes this race by checking if nothing could be collected
in the local list. This could mean that someone else might have freed
all of this group PAs for us. So simply return group->bb_free which
should also give us an upper bound on the total available space for
allocation in this group.

We need not check the fast path of whether the list (bb_prealloc_list)
is empty, since we are anyway in the slow path where the
ext4_mb_regular_allocator() has failed and hence we are now desperately trying
to discard all the group PAs to free up some space for allocation.

Suggested-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
Signed-off-by: Ritesh Harjani <riteshh@xxxxxxxxxxxxx>
---
 fs/ext4/mballoc.c | 31 ++++++++++++++++++++++++-------
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 1027e01c9011..a6c92567ec14 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3885,7 +3885,18 @@ ext4_mb_release_group_pa(struct ext4_buddy *e4b,
 }
 
 /*
- * releases all preallocations in given group
+ * This function discards all the preallocations for the given group.
+ * In this there could be a race with another process also trying to discard
+ * the PAs. So if we could not get any PA from grp->bb_prealloc_list into our
+ * local list, then we simply return grp->bb_free. This now will be an
+ * upper bound of the available space for that group. If we choose to return 0
+ * instead, then we may end up returning -ENOSPC error in the worst case
+ * scenario, which is certainly wrong.
+ *
+ * return value could be either of 2 below:-
+ *  - actual discarded PA blocks from grp->bb_prealloc_list.
+ *  - grp->bb_free value. (this will be 0 only when there are actually no free
+ *  block available to be allocated from this group anymore).
  *
  * first, we need to decide discard policy:
  * - when do we discard
@@ -3908,16 +3919,13 @@ ext4_mb_discard_group_preallocations(struct super_block *sb,
 
 	mb_debug(1, "discard preallocation for group %u\n", group);
 
-	if (list_empty(&grp->bb_prealloc_list))
-		return 0;
-
 	bitmap_bh = ext4_read_block_bitmap(sb, group);
 	if (IS_ERR(bitmap_bh)) {
 		err = PTR_ERR(bitmap_bh);
 		ext4_set_errno(sb, -err);
 		ext4_error(sb, "Error %d reading block bitmap for %u",
 			   err, group);
-		return 0;
+		goto out_dbg;
 	}
 
 	err = ext4_mb_load_buddy(sb, group, &e4b);
@@ -3925,7 +3933,7 @@ ext4_mb_discard_group_preallocations(struct super_block *sb,
 		ext4_warning(sb, "Error %d loading buddy information for %u",
 			     err, group);
 		put_bh(bitmap_bh);
-		return 0;
+		goto out_dbg;
 	}
 
 	if (needed == 0)
@@ -3967,9 +3975,15 @@ ext4_mb_discard_group_preallocations(struct super_block *sb,
 		goto repeat;
 	}
 
-	/* found anything to free? */
+	/*
+	 * If this list is empty, then return the grp->bb_free. As someone
+	 * else may have freed the PAs and updated grp->bb_free.
+	 */
 	if (list_empty(&list)) {
 		BUG_ON(free != 0);
+		mb_debug(1, "Someone may have freed PA for this group %u, grp->bb_free %d\n",
+			 group, grp->bb_free);
+		free = grp->bb_free;
 		goto out;
 	}
 
@@ -3994,6 +4008,9 @@ ext4_mb_discard_group_preallocations(struct super_block *sb,
 	ext4_unlock_group(sb, group);
 	ext4_mb_unload_buddy(&e4b);
 	put_bh(bitmap_bh);
+out_dbg:
+	mb_debug(1, "discarded (%d) blocks preallocated for group %u\n",
+		 free, group);
 	return free;
 }
 
-- 
2.21.0




[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux