On Thu, Jul 06, 2023 at 02:19:14PM -0700, Darrick J. Wong wrote: > TBH, so long as bcachefs is the only user of sixlocks and mean/variance, > I don't really care what's in them, though they probably ought to live > in fs/bcachefs/ until a second user arises. I've been waiting for Linus to weigh in on those (and the rest of the merge) since he had opinions a few weeks ago, but I have no real objection there. I'd need to add an export for osq_lock, that's all. > > - d_mark_tmpfile(): trivial new helper, from pulling out part of > > d_tmpfile(). We need this because bcachefs handles the nlink count > > for tmpfiles earlier, in the btree transaction. > > XFS might want this too, we also handle the nlink count for tmpfiles > earlier, in a transaction, and end up playing stupid games with the > refcount to fit the vfs function: > > if (tmpfile) { > /* > * The VFS requires that any inode fed to d_tmpfile must > * have nlink == 1 so that it can decrement the nlink in > * d_tmpfile. However, we created the temp file with > * nlink == 0 because we're not allowed to put an inode > * with nlink > 0 on the unlinked list. Therefore we > * have to set nlink to 1 so that d_tmpfile can > * immediately set it back to zero. > */ > set_nlink(inode, 1); > d_tmpfile(tmpfile, inode); > } Yeah, that same game would technically work for bcachefs - but I'm hoping we can just do the right thing here :) > > - Don't block on s_umount from __invalidate_super: this is a bugfix > > for a deadlock in generic/042 because of how we use sget(), the > > commit message goes into more detail. > > If this is in reference to the earlier subthread about some io_uring > thing causing unmount to hang -- my impressions of that were that yes, > it's a bug, but no, it's not a bug in bcachefs itself. I also wondered > why (a) that hadn't split out as its own thread; and (b) is this really > a bcachefs blocker? No, this is completely unrelated. The io_uring thing hits on generic/388 (and others) and just causes umount to fail with -EBUSY. This one is an actual deadlock and it hits every time in generic/042. It's specific to the loopback device and when it emits certain events, and it hits every time so I really do need this fix included. > /me shrugs, been on vacation and in hospitals for the last month or so. > > > bcachefs doesn't use sget() for mutual exclusion because a) sget() > > is insane, what we really want is the _block device_ to be opened > > exclusively (which we do), and we have our own block device opening > > path - which we need to, as we're a multi device filesystem. > > ...and isn't jan kara already messing with this anyway? The blkdev_get_handle() patchset? I like that, but I don't think that's related - if there's something more related to sget() I haven't seen it yet > OTOH there's so many layers of tiny helper functions and macros that I > have a really hard time making sense of all those pre-bcachefs changes. > That's why I haven't weighed in. Given all the weird problems we've had > recently with new code colliding badly with under-documented optimized > core code, I'm fearful of touching anything. ??? not sure what you're referring to here, are there specific patches or recent issues you're thinking of? I don't think any of the non-fs/bcachefs/ patches are remotely tricky enough for any of this to be a concern. > > You, the btrfs developers, got started when Linux filesystem teams were > > quite a bit bigger than they are now: I was at Google when Google had a > > bunch of people working on ext4, and that was when ZFS had recently come > > out and there was recognition that Linux needed an answer to ZFS and you > > were able to ride that excitement. It's been a bit harder for me to get > > something equally ambitions going, to be honest. > > > > But years ago when I realized I was onto something, I decided this > > project was only going to fail if I let it fail - so I'm in it for the > > long haul. > > > > Right now what I'm hearing, in particular from Redhat, is that they want > > it upstream in order to commit more resources. Which, I know, is not > > what kernel people want to hear, but it's the chicken-and-the-egg > > situation I'm in. > > /me suspects mainline merging is necessary but not sufficient -- few > non-developers want to deal with merging an out of tree filesystem, but > that still doesn't mean anyone will commit real engineering resources. Yeah, no doubt it will continue to be an uphill battle. But it's a necessary step in the right direction, for sure. > > > I am really, really wanting you to succeed here Kent. If the general consensus > > > is you need to have some idiot review fs/bcachefs I will happily carve out some > > > time and dig in. > > > > That would be much appreciated - I'll owe you some beers next time I see > > you. But before jumping in, let's see if we can get people who have > > already worked with the code to say something. > > > > Something I've done in the past that might be helpful - instead (or in > > addition to) having people read code in isolation, perhaps we could do a > > group call/meeting where people can ask questions about the code, bring > > up design issues they've seen in other filesystems, etc. - I've also > > found that kind of setup great for identifying places in the code where > > additional documentation would be useful. > > "At this point I think Kent's QA efforts are at least as good as XFS's, > just merge it and let's move on to the next thing." high praise :)