Re: [PATCH v2] Btrfs: fix fs corruption on transaction abort if device supports discard

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Dec 7, 2014 at 4:31 PM, Filipe Manana <fdmanana@xxxxxxxx> wrote:
When we abort a transaction we iterate over all the ranges marked as dirty
in fs_info->freed_extents[0] and fs_info->freed_extents[1], clear them
from those trees, add them back (unpin) to the free space caches and, if the fs was mounted with "-o discard", perform a discard on those regions. Also, after adding the regions to the free space caches, a fitrim ioctl call can see those ranges in a block group's free space cache and perform a discard on the ranges, so the same issue can happen without "-o discard" as well.

This causes corruption, affecting one or multiple btree nodes (in the worst case leaving the fs unmountable) because some of those ranges (the ones in the fs_info->pinned_extents tree) correspond to btree nodes/leafs that are referred by the last committed super block - breaking the rule that anything that was committed by a transaction is untouched until the next transaction
commits successfully.

This is great work Filipe, thank you!


diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 7e2405a..ce65b0c 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4134,12 +4134,6 @@ again:
 		if (ret)
 			break;

-		/* opt_discard */
-		if (btrfs_test_opt(root, DISCARD))
-			ret = btrfs_error_discard_extent(root, start,
-							 end + 1 - start,
-							 NULL);
-

While you're here, can you please just delete btrfs_error_discard_extent and use btrfs_discard_extent directly? It's already being used in non-error cases, and since it only discards, I don't see how we want to do that on errors anyway.

-chris


 		clear_extent_dirty(unpin, start, end, GFP_NOFS);
 		btrfs_error_unpin_extent_range(root, start, end);
 		cond_resched();
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index c2fc261..fc74a9b 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5728,7 +5728,8 @@ void btrfs_prepare_extent_commit(struct btrfs_trans_handle *trans,
 	update_global_block_rsv(fs_info);
 }

-static int unpin_extent_range(struct btrfs_root *root, u64 start, u64 end) +static int unpin_extent_range(struct btrfs_root *root, u64 start, u64 end,
+			      const bool return_free_space)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
 	struct btrfs_block_group_cache *cache = NULL;
@@ -5752,7 +5753,8 @@ static int unpin_extent_range(struct btrfs_root *root, u64 start, u64 end)

 		if (start < cache->last_byte_to_unpin) {
 			len = min(len, cache->last_byte_to_unpin - start);
-			btrfs_add_free_space(cache, start, len);
+			if (return_free_space)
+				btrfs_add_free_space(cache, start, len);
 		}

 		start += len;
@@ -5816,7 +5818,7 @@ int btrfs_finish_extent_commit(struct btrfs_trans_handle *trans,
 						   end + 1 - start, NULL);

 		clear_extent_dirty(unpin, start, end, GFP_NOFS);
-		unpin_extent_range(root, start, end);
+		unpin_extent_range(root, start, end, true);
 		cond_resched();
 	}

@@ -9705,7 +9707,7 @@ out:

int btrfs_error_unpin_extent_range(struct btrfs_root *root, u64 start, u64 end)
 {
-	return unpin_extent_range(root, start, end);
+	return unpin_extent_range(root, start, end, false);
 }

 int btrfs_error_discard_extent(struct btrfs_root *root, u64 bytenr,
--
2.1.3


--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]