On Fri, Jul 07, 2023 at 03:13:06PM +0200, Jan Kara wrote: > On Thu 06-07-23 18:43:14, Kent Overstreet wrote: > > On Thu, Jul 06, 2023 at 02:19:14PM -0700, Darrick J. Wong wrote: > > > /me shrugs, been on vacation and in hospitals for the last month or so. > > > > > > > bcachefs doesn't use sget() for mutual exclusion because a) sget() > > > > is insane, what we really want is the _block device_ to be opened > > > > exclusively (which we do), and we have our own block device opening > > > > path - which we need to, as we're a multi device filesystem. > > > > > > ...and isn't jan kara already messing with this anyway? > > > > The blkdev_get_handle() patchset? I like that, but I don't think that's > > related - if there's something more related to sget() I haven't seen it > > yet > > There's a series on top of that that also modifies how sget() works [1]. > Christian wants that bit to be merged separately from the bdev handle stuff > and Christoph chimed in with some other related cleanups so he'll now take > care of that change. > > Anyhow we should have sget() that does not exclusively claim the bdev > unless it needs to create a new superblock soon. Thanks for the link sget() felt a bit odd in bcachefs because we have our own bch2_fs_open() path that, completely separately from the VFS opens a list of block devices and returns a fully constructed filesystem handle. We need this because it's also used in userspace, where we don't have the VFS and it wouldn't make much sense to lift sget(), for e.g. fsck and other tools. IOW, we really do need to own the whole codepath that opens the actual block devices; our block device open path does things like parse the opts struct to decide whether to open the block device in write mode or exclusive mode... So the way around this in bcachefs is we call sget() twice - first in "find an existing sb but don't create one" mode, then if that fails we call bch2_fs_open() and call sget() again in "create a super_block and attach it to this bch_fs" - a bit awkward but it works. Not sure if this has come up in other filesystems, but here's the relevant bcachefs code: https://evilpiepirate.org/git/bcachefs.git/tree/fs/bcachefs/fs.c#n1756