Re: [RFC v4 linux-next 19/19] fs & block: remove bdev->bd_inode

On Fri, Mar 22, 2024 at 02:10:30PM +0100, Jan Kara wrote:
> > End result:
> > 
> > * old ->i_private leaked (already grabbed by your get_bdev_file())
> > * ->bd_openers at 1 (after your bdev_open() gets through)
> > * ->i_private left NULL.
> > 
> > Christoph, could we please get rid of that atomic_t nonsense?
> > It only confuses people into brainos like that.  It really
> > needs ->open_mutex for any kind of atomicity.
> 
> Well, there are a couple of places where we end up reading bd_openers
> without ->open_mutex. Sure, these places are racy wrt other opens / closes,
> so they need to be careful, but we want to make sure we read back at least
> some sane value, which is not guaranteed with a plain int and the compiler
> possibly playing weird tricks when updating it. But yes, we could convert
> the atomic_t to using READ_ONCE + WRITE_ONCE in appropriate places to avoid
> these issues and make it more obvious that bd_openers is not really handled
> in an atomic way.

What WRITE_ONCE()?  We really shouldn't modify it without ->open_mutex; do
we ever do that?  In current mainline:

in blkdev_get_whole(), both callers under ->open_mutex:
block/bdev.c:671:       if (!atomic_read(&bdev->bd_openers))
block/bdev.c:675:       atomic_inc(&bdev->bd_openers);

in blkdev_put_whole(), the sole caller under ->open_mutex:
block/bdev.c:681:       if (atomic_dec_and_test(&bdev->bd_openers))

in blkdev_get_part(), both callers under ->open_mutex:
block/bdev.c:700:       if (!atomic_read(&part->bd_openers)) {
block/bdev.c:704:       atomic_inc(&part->bd_openers);

in blkdev_put_part(), the sole caller under ->open_mutex:
block/bdev.c:741:       if (atomic_dec_and_test(&part->bd_openers)) {

in bdev_release(), a deliberately racy reader, commented as such:
block/bdev.c:1032:      if (atomic_read(&bdev->bd_openers) == 1)

in sync_bdevs(), under ->open_mutex:
block/bdev.c:1163:              if (!atomic_read(&bdev->bd_openers)) {

in bdev_del_partition(), under ->open_mutex:
block/partitions/core.c:460:    if (atomic_read(&part->bd_openers))

and finally, in disk_openers(), a racy reader:
include/linux/blkdev.h:231:     return atomic_read(&disk->part0->bd_openers);

So that's two sites that would need READ_ONCE() and a bunch of reads and
writes under ->open_mutex.
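
FWIW, a sketch of what that conversion might look like (untested; assumes
->bd_openers becomes a plain int - the WRITE_ONCE() on the update side is
there to pair with the two lockless readers):

	/* updates - all of them under ->open_mutex (blkdev_get_whole() etc.) */
	WRITE_ONCE(bdev->bd_openers, bdev->bd_openers + 1);
	...
	WRITE_ONCE(bdev->bd_openers, bdev->bd_openers - 1);

	/* readers already under ->open_mutex can stay plain */
	if (!bdev->bd_openers)
		...

	/* the two lockless readers get READ_ONCE() */
	if (READ_ONCE(bdev->bd_openers) == 1)		/* bdev_release() */
		...
	return READ_ONCE(disk->part0->bd_openers);	/* disk_openers() */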
Callers of disk_openers() need to be careful; looking through those,
some are under ->open_mutex (either explicitly, or as e.g. lo_release(),
called only via bdev ->release(), which comes only under ->open_mutex),
but four of them are not:

arch/um/drivers/ubd_kern.c:1023:                if (disk_openers(ubd_dev->disk))
in ubd_remove().  Racy, possibly a bug.  AFAICS, it's reachable through the
UML console and there's nothing to stop it from racing with open().

drivers/block/loop.c:1245:      if (disk_openers(lo->lo_disk) > 1) {
in loop_clr_fd().  Under loop's private lock, but that's likely to be
a race - ->bd_openers updates are not done under that lock.  Note that
there's no ->open() for /dev/loop, BTW...
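
To spell the race out (hypothetical interleaving; the open path never takes
loop's private lock, so nothing orders the check against the increment):

	/*
	 * loop_clr_fd()			open() of the loop bdev
	 * -------------			-----------------------
	 * take loop's private lock
	 * disk_openers() reads 1
	 *   -> decide to tear down
	 *					mutex_lock(->open_mutex)
	 *					atomic_inc(&bdev->bd_openers)
	 *					mutex_unlock(->open_mutex)
	 * ...clear the fd from under a
	 * freshly-arrived opener
	 */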

drivers/block/loop.c:2161:      if (lo->lo_state != Lo_unbound || disk_openers(lo->lo_disk) > 0) {
in loop_control_remove().  Similar to the previous one, except that
it's done out of band *and* it doesn't have the "autoclear" logic
to work around udev, lovingly described in the comment before the
call in loop_clr_fd().

drivers/block/nbd.c:1279:       if (disk_openers(nbd->disk) > 1)
in nbd_bdev_reset().  Under nbd's private mutex (->config_lock),
so there's some exclusion with nbd_open(), but the ->bd_openers change
comes outside of it.  Might or might not be a bug - I need to wake
up properly to look through that.
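
FWIW, the pattern that would actually close these races is the one
bdev_del_partition() uses - do the check under ->open_mutex.  A sketch
(whether lock ordering vs. the drivers' private locks allows taking
->open_mutex in these places is a separate question):

	mutex_lock(&disk->open_mutex);
	if (disk_openers(disk)) {
		/* still open; bail */
		mutex_unlock(&disk->open_mutex);
		return -EBUSY;
	}
	/* no opener can appear until we drop ->open_mutex */
	...tear the thing down...
	mutex_unlock(&disk->open_mutex);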
