On Sun, Dec 7, 2014 at 4:31 PM, Filipe Manana <fdmanana@xxxxxxxx> wrote:
When we abort a transaction we iterate over all the ranges marked as
dirty
in fs_info->freed_extents[0] and fs_info->freed_extents[1], clear them
from those trees, add them back (unpin) to the free space caches and,
if
the fs was mounted with "-o discard", perform a discard on those
regions.
Also, after adding the regions to the free space caches, a fitrim
ioctl call
can see those ranges in a block group's free space cache and perform
a discard
on the ranges, so the same issue can happen without "-o discard" as
well.
This causes corruption, affecting one or multiple btree nodes (in the
worst
case leaving the fs unmountable) because some of those ranges (the
ones in
the fs_info->pinned_extents tree) correspond to btree nodes/leafs
that are
referred by the last committed super block - breaking the rule that
anything
that was committed by a transaction is untouched until the next
transaction
commits successfully.
This is great work Filipe, thank you!
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 7e2405a..ce65b0c 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4134,12 +4134,6 @@ again:
if (ret)
break;
- /* opt_discard */
- if (btrfs_test_opt(root, DISCARD))
- ret = btrfs_error_discard_extent(root, start,
- end + 1 - start,
- NULL);
-
While you're here, can you please just delete
btrfs_error_discard_extent and use btrfs_discard_extent directly? It's
already being used in non-error cases, and since it only discards, I
don't see how we want to do that on errors anyway.
-chris
clear_extent_dirty(unpin, start, end, GFP_NOFS);
btrfs_error_unpin_extent_range(root, start, end);
cond_resched();
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index c2fc261..fc74a9b 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5728,7 +5728,8 @@ void btrfs_prepare_extent_commit(struct
btrfs_trans_handle *trans,
update_global_block_rsv(fs_info);
}
-static int unpin_extent_range(struct btrfs_root *root, u64 start,
u64 end)
+static int unpin_extent_range(struct btrfs_root *root, u64 start,
u64 end,
+ const bool return_free_space)
{
struct btrfs_fs_info *fs_info = root->fs_info;
struct btrfs_block_group_cache *cache = NULL;
@@ -5752,7 +5753,8 @@ static int unpin_extent_range(struct btrfs_root
*root, u64 start, u64 end)
if (start < cache->last_byte_to_unpin) {
len = min(len, cache->last_byte_to_unpin - start);
- btrfs_add_free_space(cache, start, len);
+ if (return_free_space)
+ btrfs_add_free_space(cache, start, len);
}
start += len;
@@ -5816,7 +5818,7 @@ int btrfs_finish_extent_commit(struct
btrfs_trans_handle *trans,
end + 1 - start, NULL);
clear_extent_dirty(unpin, start, end, GFP_NOFS);
- unpin_extent_range(root, start, end);
+ unpin_extent_range(root, start, end, true);
cond_resched();
}
@@ -9705,7 +9707,7 @@ out:
int btrfs_error_unpin_extent_range(struct btrfs_root *root, u64
start, u64 end)
{
- return unpin_extent_range(root, start, end);
+ return unpin_extent_range(root, start, end, false);
}
int btrfs_error_discard_extent(struct btrfs_root *root, u64 bytenr,
--
2.1.3
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html