Re: [PATCH v2 34/34] ext4: rely on sb->f_bdev only

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Feb 01, 2024 at 12:34:24PM +0100, Jan Kara wrote:
> On Tue 23-01-24 14:26:51, Christian Brauner wrote:
> > (1) Instead of bdev->bd_inode->i_mapping we do f_bdev->f_mapping
> > (2) Instead of bdev->bd_inode we could do f_bdev->f_inode
> > 
> > I mention this explicitly because (1) is dependent on how the block
> > device is opened while (2) isn't. Consider:
> > 
> > mount -t tmpfs tmpfs /mnt
> > mknod /mnt/foo b <minor> <major>
> > open("/mnt/foo", O_RDWR);
> > 
> > then (1) doesn't work because f_bdev->f_inode is a tmpfs inode _not_ the
> > actual bdev filesystem inode. But (2) is still the bd_inode->i_mapping
> > as that's set up during bdev_open().
> > 
> > IOW, I'm explicitly _not_ going via f_bdev->f_inode but via
> > f_bdev->f_mapping->host aka bdev_file_inode(f_bdev). Currently this
> > isn't a problem because sb->s_bdev_file stashes the a block device file
> > opened via bdev_open_by_*() which is always a file on the bdev
> > filesystem.
> > 
> > _If_ we ever wanted to allow userspace to pass a block device file
> > descriptor during superblock creation. Say:
> > 
> > fsconfig(fs_fd, FSCONFIG_CMD_CREATE_EXCL, "source", bdev_fd);
> > 
> > then using f_bdev->f_inode would be very wrong. Another thing to keep in
> > mind there would be that this would implicitly pin another filesystem.
> > Say:
> > 
> > mount -t ext4 /dev/sda /mnt
> > mknod /mnt/foo b <minor> <major>
> > bdev_fd = open("/mnt/foo", O_RDWR);
> > 
> > fd_fs = fsopen("xfs")
> > fsconfig(fd_fs, FSCONFIG_CMD_CREATE, "source", bdev_fd);
> > fd_mnt = fsmount(fd_fs);
> > move_mount(fd_mnt, "/mnt2");
> > 
> > umount /mnt # EBUSY
> > 
> > Because the xfs filesystem now pins the ext4 filesystem via the
> > bdev_file we're keeping. In other words, this is probably a bad idea and
> > if we allow userspace to do this then we should only use the provided fd
> > to lookup the block device and open our own handle to it.
> > 
> > Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>
> 
> I suppose this is more or less a sample how to get rid of sb->s_bdev /
> bd_inode dereferences AFAICT? Because otherwise I'm not sure why it was

Yes, correct. That was just a way to show that you don't need that
anymore.

> included in this series...

Yes, that would likely go separately.

> 
> > @@ -5576,7 +5576,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
> >  	 * used to detect the metadata async write error.
> >  	 */
> >  	spin_lock_init(&sbi->s_bdev_wb_lock);
> > -	errseq_check_and_advance(&sb->s_bdev->bd_inode->i_mapping->wb_err,
> > +	errseq_check_and_advance(&sb->s_bdev_file->f_mapping->wb_err,
> >  				 &sbi->s_bdev_wb_err);
> 
> So when we have struct file, it would be actually nicer to drop
> EXT4_SB(sb)->s_bdev_wb_err completely and instead use
> file_check_and_advance_wb_err(sb->s_bdev_file). But that's a separate
> cleanup I suppose.

Yep. I forgot about that helper.




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux