On 2022/4/2 00:39, David Sterba wrote:
On Fri, Apr 01, 2022 at 03:29:37PM +0800, Qu Wenruo wrote:
[BUG]
For a 4K sector sized btrfs with v1 cache enabled and only mounted on
systems with 4K page size, if it's mounted on subpage (64K page size)
systems, it can cause the following warning on v1 space cache:
BTRFS error (device dm-1): csum mismatch on free space cache
BTRFS warning (device dm-1): failed to load free space cache for block group 84082688, rebuilding it now
Although not a big deal, as kernel can rebuild it without problem, such
warning will bother end users, especially if they want to switch the
same btrfs seamlessly between different page sized systems.
[CAUSE]
V1 free space cache is still using fixed PAGE_SIZE for various bitmap,
like BITS_PER_BITMAP.
Such hard-coded PAGE_SIZE usage will cause various mismatch, from v1
cache size to checksum.
Thus kernel will always reject v1 cache with a different PAGE_SIZE with
csum mismatch.
[FIX]
Although we should fix v1 cache, it's already going to be marked
deprecated soon.
And we have v2 cache based on metadata (which is already fully subpage
compatible), and it has almost everything superior than v1 cache.
So just force subpage mount to use v2 cache on mount.
Reported-by: Matt Corallo <blnxfsl@xxxxxxxxxxx>
CC: stable@xxxxxxxxxxxxxxx # 5.15+
Link: https://lore.kernel.org/linux-btrfs/61aa27d1-30fc-c1a9-f0f4-9df544395ec3@xxxxxxxxxxx/
Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
---
fs/btrfs/disk-io.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index d456f426924c..34eb6d4b904a 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3675,6 +3675,17 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
if (sectorsize < PAGE_SIZE) {
struct btrfs_subpage_info *subpage_info;
+ /*
+ * V1 space cache has some hardcoded PAGE_SIZE usage, and is
+ * going to be deprecated.
+ *
+ * Force to use v2 cache for subpage case.
+ */
+ btrfs_clear_opt(fs_info->mount_opt, SPACE_CACHE);
+ btrfs_set_and_info(fs_info, FREE_SPACE_TREE,
+ "forcing free space tree for sector size %u with page size %lu",
+ sectorsize, PAGE_SIZE);
I'm not sure this is implemented the right way. Why is it unconditional?
Isn't the same thing we do when parsing the mount options for switch
v1->v2 cache?
Does any subsequent mount have to clear and set the bits after it has
been already? Or what if the free space tree is set at mkfs time, which
is now the default.
The function btrfs_set_and_info() will only inform the end users if the
bit is not set.
Next, remounting v1->v2 does more things, like removing the v1 tree if
it exists. And due to some bugs there are more bits for free space tree
to be set like FREE_SPACE_TREE_VALID. So I don't thing this patch
covers all cases for the v2.
You're right on remounting, but in the opposite way.
There is nothing prevent the users from re-enabling v1 cache.
I should also prevent user from setting v1 cache.
Another concern is, I didn't see code cleaning up v1 cache when we do
the v1->v2 switch.
The only code doing such cleanup is cleanup_free_space_cache_v1() in
free-space-cache.c, but it only gets called in
btrfs_set_free_space_cache_v1_active().
Or did I miss something?
Thanks,
Qu