On Wed, Mar 05, 2025 at 10:48:34AM -0800, Darrick J. Wong wrote: > On Wed, Mar 05, 2025 at 09:04:50AM -0800, Luis Chamberlain wrote: > > On Wed, Mar 05, 2025 at 02:15:47PM +0000, Matthew Wilcox wrote: > > > On Tue, Mar 04, 2025 at 10:33:30PM -0800, Darrick J. Wong wrote: > > > > > So this is expedient because XFS happens to not call sb_set_blocksize()? > > > > > What is the path forward for filesystems which call sb_set_blocksize() > > > > > today and want to support LBS in future? > > > > > > > > Well they /could/ set sb_blocksize/sb_blocksize_bits themselves, like > > > > XFS does. > > > > > > I'm kind of hoping that isn't the answer. > > > > set_blocksize() can be used. The only extra steps the filesystem needs > > to in addition is: > > > > sb->s_blocksize = size; > > sb->s_blocksize_bits = blksize_bits(size); > > > > Which is what both XFS and bcachefs do. > > > > We could modify sb to add an LBS flag but that alone would not suffice > > either as the upper limit is still a filesystem specific limit. Additionally > > it also does not suffice for filesystems that support a different device > > for metadata writes, for instance XFS supports this and uses the sector > > size for set_blocksize(). > > > > So I think that if ext4 for example wants to use LBS then simply it > > would open code the above two lines and use set_blocksize(). Let me know > > if you have any other recommendations. > > int sb_set_large_blocksize(struct super_block *sb, int size) > { > if (set_blocksize(sb->s_bdev_file, size)) > return 0; > sb->s_blocksize = size; > sb->s_blocksize_bits = blksize_bits(size); > return sb->s_blocksize; > } > EXPORT_SYMBOL_GPL(sb_set_large_blocksize); > > int sb_set_blocksize(struct super_block *sb, int size) > { > if (size > PAGE_SIZE) > return 0; > return sb_set_large_blocksize(sb, size); > } > EXPORT_SYMBOL(sb_set_blocksize); > > Though you'll note that this doesn't help XFS, or any other filesystem > where the bdev block size isn't set to the fs block size. But xfs can > just be weird on its own like always. ;) Actually, I failed to also notice XFS implicitly calls sb_set_large_blocksize() through: xfs_fs_get_tree() --> get_tree_bdev() --> get_tree_bdev_flags() --> get_tree_bdev_flags() --> sb_set_blocksize() We just don't care if sb_set_blocksize() if fails inside get_tree_bdev_flags(). To be clear this has been the case since we added and tested LBS on XFS. So if we wanted something more automatic, how about: diff --git a/block/bdev.c b/block/bdev.c index 3bd948e6438d..4844d1e27b6f 100644 --- a/block/bdev.c +++ b/block/bdev.c @@ -181,6 +181,8 @@ EXPORT_SYMBOL(set_blocksize); int sb_set_blocksize(struct super_block *sb, int size) { + if (!(sb->s_type->fs_flags & FS_LBS) && size > PAGE_SIZE) + return 0; if (set_blocksize(sb->s_bdev_file, size)) return 0; /* If we get here, we know size is validated */ diff --git a/fs/bcachefs/fs.c b/fs/bcachefs/fs.c index 90ade8f648d9..e99e378d68ea 100644 --- a/fs/bcachefs/fs.c +++ b/fs/bcachefs/fs.c @@ -2396,7 +2396,7 @@ static struct file_system_type bcache_fs_type = { .name = "bcachefs", .init_fs_context = bch2_init_fs_context, .kill_sb = bch2_kill_sb, - .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP, + .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_LBS, }; MODULE_ALIAS_FS("bcachefs"); diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index d92d7a07ea89..3d8b80165d48 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -2118,7 +2118,8 @@ static struct file_system_type xfs_fs_type = { .init_fs_context = xfs_init_fs_context, .parameters = xfs_fs_parameters, .kill_sb = xfs_kill_sb, - .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_MGTIME, + .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_MGTIME | + FS_LBS, }; MODULE_ALIAS_FS("xfs"); diff --git a/include/linux/fs.h b/include/linux/fs.h index 2c3b2f8a621f..16d17bd82be0 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2616,6 +2616,7 @@ struct file_system_type { #define FS_DISALLOW_NOTIFY_PERM 16 /* Disable fanotify permission events */ #define FS_ALLOW_IDMAP 32 /* FS has been updated to handle vfs idmappings. */ #define FS_MGTIME 64 /* FS uses multigrain timestamps */ +#define FS_LBS 128 /* FS supports LBS */ #define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */ int (*init_fs_context)(struct fs_context *); const struct fs_parameter_spec *parameters;