Re: [PATCH v2 01/34] bdev: open block device as files

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jan 29, 2024 at 05:02:41PM +0100, Christoph Hellwig wrote:
> > +static unsigned blk_to_file_flags(blk_mode_t mode)
> > +{
> > +	unsigned int flags = 0;
> > +
> 
> ...
> 
> > +	/*
> > +	 * O_EXCL is one of those flags that the VFS clears once it's done with
> > +	 * the operation. So don't raise it here either.
> > +	 */
> > +	if (mode & BLK_OPEN_NDELAY)
> > +		flags |= O_NDELAY;
> 
> O_EXCL isn't dealt with in this helper at all.

Yeah, on purpose was my point bc we can just rely on @holder and passing
_EXCL without holder is invalid. But I could add it.

> 
> > +	/*
> > +	 * If BLK_OPEN_WRITE_IOCTL is set then this is a historical quirk
> > +	 * associated with the floppy driver where it has allowed ioctls if the
> > +	 * file was opened for writing, but does not allow reads or writes.
> > +	 * Make sure that this quirk is reflected in @f_flags.
> > +	 */
> > +	if (mode & BLK_OPEN_WRITE_IOCTL)
> > +		flags |= O_RDWR | O_WRONLY;
> 
> .. and BLK_OPEN_WRITE_IOCTL will never be passed to it.  It only comes
> from open block devices nodes.
> 
> That being said, passing BLK_OPEN_* to bdev_file_open_by_* actually
> feels wrong.  They deal with files and should just take normal
> O_* flags instead of translating from BLK_OPEN_* to O_* back to
> BLK_OPEN_* for the driver (where they make sense as the driver
> flags are pretty different from what is passed to open).
> 
> Now of course changing that would make a mess of the whole series,
> so maybe that can go into a new patch at the end?

Yes, I had considered that and it would work but there's the issue that
we need to figure out how to handle BLK_OPEN_RESTRICT_WRITES. It has no
corresponding O_* flag that would let us indicate this. So I had
considered:

1/ Expose bdev_file_open_excl() so callers don't need to pass any
   specific flags. Nearly all filesystems would effectively use this
   helper as sb_open_mode() adds it implicitly. That would have the
   side-effect of introducing another open helper ofc; possibly two if
   we take _by_dev() and _by_path() into account.

2/ Abuse an O_* flag to mean BLK_OPEN_RESTRICT_WRITES. For example,
   O_TRUNC or O_NOCTTY which is pretty yucky.

3/ Introduce an internal O_* flag which is also ugly. Vomitorious and my
   co-maintainers would likely chop off my hands so I can't go near a
   computer again.

3/ Make O_EXCL when passed together with bdev_file_open_by_*() always
   imply BLK_OPEN_RESTRICT_WRITES.

The 3/ option would probably be the cleanest one and I think that all
filesystems now pass at least a holder and holder ops so this _should_
work.

Thoughts?

> 
> > + * @noaccount: whether this is an internal open that shouldn't be counted
> >   */
> >  static struct file *alloc_file(const struct path *path, int flags,
> > -		const struct file_operations *fop)
> > +		const struct file_operations *fop, bool noaccount)
> 
> Just a suggestion as you are the maintainer here, but I always find
> it hard to follow when infrastructure in subsystem A is changed in
> a patch primarily changing subsystem B.  Can the file_table.c
> changes go into a separate patch or patches with commit logs
> documenting their semantics?
> 
> And while we're at the semantics I find this area already a bit of a
> a mess and this doesn't make it any better..
> 
> How about the following:
> 
>  - alloc_file loses the actual file allocation and gets a new name
>    (unfortunatel init_file is already taken), callers call
>    alloc_empty_file_noaccount or alloc_empty_file plus the
>    new helper.
>  - similarly __alloc_file_pseudo is split into a helper creating
>    a path for mnt and inode, and callers call that plus the
>    file allocation
> 
> ?

Ok, let me see how far I get.

> 
> > +extern struct file *alloc_file_pseudo_noaccount(struct inode *, struct vfsmount *,
> 
> no need for the extern here.
> 
> > +	struct block_device	*s_bdev;	/* can go away once we use an accessor for @s_bdev_file */
> 
> can you put the comment into a separate line to make it readable.
> 
> But I'm not even sure it should go away.  s_bdev is used all over the
> data and metadata I/O path, so caching it and avoiding multiple levels
> of pointer chasing would seem useful.

Fair.




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux