On Wed, Aug 28, 2019 at 03:01:16PM -0400, Josh Boyer wrote:
> On Wed, Aug 28, 2019 at 2:40 PM Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
> >
> > On Wed, Aug 28, 2019 at 02:35:39PM -0400, Laura Abbott wrote:
> > > On 8/28/19 1:58 PM, Josef Bacik wrote:
> > > > On Tue, Aug 27, 2019 at 07:53:20AM -0400, Laura Abbott wrote:
> > > > > On 8/26/19 11:39 PM, Neal Gompa wrote:
> > > > > > On Mon, Aug 26, 2019 at 11:16 AM Laura Abbott <labbott@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On 8/23/19 9:00 PM, Chris Murphy wrote:
> > > > > > > > On Fri, Aug 23, 2019 at 1:17 PM Adam Williamson
> > > > > > > > <adamwill@xxxxxxxxxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > > So, there was recently a Thing where btrfs installs were broken, and
> > > > > > > > > this got accepted as a release blocker:
> > > > > > > > >
> > > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1733388
> > > > > > > >
> > > > > > > > Summary: This bug was introduced and discovered in linux-next, it
> > > > > > > > started to affect Fedora 5.3.0-rc0 kernels in openqa tests, patch
> > > > > > > > appeared during rc1, and the patch was merged into 5.3.0-rc2. The bug
> > > > > > > > resulted in a somewhat transient deadlock which caused installs to
> > > > > > > > hang, but no corruption. The fix, 2 files changed, 12 insertions, 8
> > > > > > > > deletions (1/2 the insertions are comments).
> > > > > > > >
> > > > > > > > How remarkable or interesting is this bug? And in particular, exactly
> > > > > > > > how much faster should it have been fixed in order to avoid worrying
> > > > > > > > about it being a blocker bug?
> > > > > > > >
> > > > > > > > 7/25 14:27 utc bug patch was submitted to linux-btrfs@
> > > > > > > > 7/25 22:33 utc bug was first reported in Fedora bugzilla
> > > > > > > > 7/26 19:20 utc I confirmed upstream's patch related to this bug with
> > > > > > > > upstream and updated the Fedora bug
> > > > > > > > 7/26 22:50 utc I confirmed it was merged into rc2, and updated the Fedora bug
> > > > > > > >
> > > > > > > > So in the context of status quo, where Btrfs is presented as an option
> > > > > > > > in the installer and if there are bugs they Beta blocking, how could
> > > > > > > > or should this have been fixed sooner? What about the handling should
> > > > > > > > have been different?
> > > > > > > >
> > > > > > >
> > > > > > > That's a fair question. This bug actually represents how this _should_
> > > > > > > work. The concern is that in the past we haven't seen a lot engagement
> > > > > > > in the past. Maybe today that has changed as demonstrated by this thread.
> > > > > > > I'm still concerned about having this be a blocker vs. just keeping it
> > > > > > > as an option, simply because a blocker stops the entire release and it
> > > > > > > can be a last minute scramble to get things fixed. This was the ideal
> > > > > > > case for a blocker bugs and I'm skeptical about all bugs going this well.
> > > > > > > If we had a few more people who were willing to be on the btrfs alias and
> > > > > > > do the work for blocker bugs it would be a much stronger case.
> > > > > > >
> > > > > >
> > > > > > Out of curiosity, how many such issues have we had in the past 2
> > > > > > years? I personally can't recall any monumental occasions where people
> > > > > > were scrambling over *Btrfs* in Fedora. If anything, we continue to
> > > > > > inherit the work that SUSE and Facebook are doing upstream as part of
> > > > > > us continually updating our kernels, which I'm grateful for.
> > > > > >
> > > > > > And in the instances where we've had such issues, has anyone reached
> > > > > > out to btrfs folks in Fedora? Chris and myself are the current ones,
> > > > > > but there have been others in the past. Both of us are subscribed to
> > > > > > the linux-btrfs mailing list, and Chris has a decent rapport with most
> > > > > > of the btrfs developers.
> > > > > >
> > > > > > What more do you want? Actual btrfs developers in Fedora? We don't
> > > > > > have any for the majority of filesystems Fedora supports, only XFS. Is
> > > > > > there some kind of problem with communicating with the upstream kernel
> > > > > > developers about Fedora bugs that I'm not aware of?
> > > > > >
> > > > >
> > > > > Again, it's about length of overall development. ext and XFS have
> > > > > a much longer history in general which is something that's important
> > > > > for file system stability in general. It's also a bit of a catch-22
> > > > > where the rate of btrfs use in Fedora is so low we don't actually
> > > > > see issues.
> > > > >
> > > > >
> > > > > > > > I note here that ext2 and ext3 are offered as file systems in
> > > > > > > > Custom/Advanced partitioning and in this sense have parity with Btrfs.
> > > > > > > > If this same bug occurred in ext2 or ext3 would or should that cause
> > > > > > > > discussion to drop them from the installer, even if the bug were fixed
> > > > > > > > within 24 hours of discovery and patch? What about vfat? That's
> > > > > > > > literally the only truly required filesystem that must work, for the
> > > > > > > > most commonly supported hardware so it can't be dropped, we'd just be
> > > > > > > > stuck until it got fixed. That work would have to be done upstream,
> > > > > > > > yes?
> > > > > > > >
> > > > > > >
> > > > > > > I don't think that's really a fair comparison. Just because options
> > > > > > > are presented doesn't mean all of them are equal.
> > > > > > > ext2/ext3 and vfat
> > > > > > > have been in development for much longer than btrfs and length of development
> > > > > > > is something that's particularly important for file system stability
> > > > > > > from talking with file system developers. It's not impossible for there
> > > > > > > to be bugs in ext4 for example (we've certainly seen them before) but
> > > > > > > btrfs is only now gaining overall stability and we're still more likely to see
> > > > > > > bugs, especially with custom setups where people are likely to find
> > > > > > > edge cases.
> > > > > > >
> > > > > >
> > > > > > Nope. We can totally use this because LVM has not existed as long (we
> > > > > > use LVM + filesystem by default, not plain partitions), and we still
> > > > > > encounter quirks with things like thinp LVM combined with these
> > > > > > filesystems. OverlayFS is mostly hot garbage (kernel people know it,
> > > > > > container people know it, filesystem people know it, etc.), and yet we
> > > > > > continue to try to use it in more places. Stratis is in an odd state
> > > > > > of limbo now, since its main developer and advocate left Red Hat.
> > > > > > > There are plenty of examples of Red Hat doing crazy/experimental
> > > > > > things... I'd like to think Red Hat isn't supposed to be special here,
> > > > > > but in this realm, it seems like it is...
> > > > > >
> > > > > >
> > > > >
> > > > > btrfs still doesn't give me the warm fuzzies and I also think this
> > > > > is a bigger issue than other features simply because user data is at
> > > > > stake. We do need to consider that the failure case is not "I can't do X"
> > > > > but "my precious data which I have been trying to snapshot is now
> > > > > inaccessible" in a way that's even worse than say rpm database
> > > > > corruption.
> > > > > Even if it is in the advanced partitioning or not the
> > > > > default, we can still end up with people clicking in because they
> > > > > read an article about how btrfs was the hot new thing.
> > > > >
> > > > > There are two parts to this here: killing off btrfs entirely and
> > > > > btrfs as release criteria. I think you are correct that there's
> > > > > enough community support to justify keeping btrfs around at least
> > > > > in the kernel (I can't speak for anaconda here)
> > > > >
> > > > > As for btrfs as release criteria, I'd feel much more confident
> > > > > about that if we could have a file system developer on the btrfs
> > > > > alias. I'm glad to hear the btrfs upstream community has been
> > > > > receptive to bugs but it's still much easier to make things
> > > > > happen if we have contributors who are active in the Fedora
> > > > > community, especially if we want the advanced features that
> > > > > btrfs has (which is why people want it anyway). So, who would
> > > > > you suggest to work with us in Fedora?
> > > >
> > > > You can always CC me, if I get an email from you or anybody else I recognize
> > > > from the fedora kernel team I'm going to pay attention to it.
> > > >
> > > > Facebook runs more btrfs file systems than Fedora has installs, so we're pretty
> > > > happy with how it works stability wise. That being said we're slightly more
> > > > fault tolerant than most users. If you guys are hitting problems chances are
> > > > we'll hit them eventually as well, so it makes sense for us to be on top of
> > > > them.
> > > >
> > > > I agree it would be better if somebody inside Fedora was able to help out, but
> > > > again I'm only an email away.
> > > > Thanks,
> > > >
> > >
> > > So it appears you are on the btrfs alias already:
> > >
> > > fedora-kernel-btrfs: fs-maint@xxxxxxxxxx,josef@xxxxxxxxxxxxxx,bugzilla@xxxxxxxxxxxxxxxxx
> > >
> > > This technically meets the requirements if you are willing to stay on this
> > > alias and (continue) to help with requests as needed. I would feel more
> > > confident if we had a few more people involved as well. Even better
> > > would be proactively going through the bugzillas to help find the
> > > btrfs ones.
> >
> > Yeah that goes into a bucket that basically is ignored. The only time I'll peek
> > in there is if somebody specifically pokes me, because generally speaking we hit
> > the problems and fix them welllllll before Fedora users start to notice them.
>
> Fedora chugs along at the rate of daily upstream Linus snapshots. If
> you're hitting and fixing issues before Fedora users see them, I'm
> curious why Fedora users would ever see them.
>
> Where does the lag come from? Are the fixes queued internally?
> Staged in an upstream subsystem tree? Is there a way for interested
> btrfs people to proactively just get those fixed in Fedora before
> users hit them?

For this particular example we saw the problem in testing and had a patch on the
mailing list before you hit the problem. It was in a tree and sent to Linus, and
was merged the day after the bugzilla was reported. So yes, before users see
them, unless they are subscribed to the daily snapshots, which I assume is just
for testing, right? Or were you guys going to ship 5.3-rc0?

On one hand I understand all of the consternation around making btrfs bugs
blockers for Fedora, but on the other hand it seems a bit silly to be having
this conversation at all based on hitting a bug that went into the merge window
and then was fixed before rc1 was even cut.
Thanks,

Josef