On Tue, Dec 07, 2021 at 05:52:40PM +0100, Hans de Goede wrote: > On 11/10/21 14:05, Hans de Goede wrote: > > On 11/10/21 09:45, Hans de Goede wrote: > >> On 11/9/21 23:07, Bjorn Helgaas wrote: > >>> On Sat, Nov 06, 2021 at 11:15:07AM +0100, Hans de Goede wrote: > >>>> On 10/20/21 23:14, Bjorn Helgaas wrote: > >>>>> On Wed, Oct 20, 2021 at 12:23:26PM +0200, Hans de Goede wrote: > >>>>>> On 10/19/21 23:52, Bjorn Helgaas wrote: > >>>>>>> On Thu, Oct 14, 2021 at 08:39:42PM +0200, Hans de Goede wrote: > >>>>>>>> Some BIOS-es contain a bug where they add addresses which map to system > >>>>>>>> RAM in the PCI host bridge window returned by the ACPI _CRS method, see > >>>>>>>> commit 4dc2287c1805 ("x86: avoid E820 regions when allocating address > >>>>>>>> space"). > >>>>>>>> > >>>>>>>> To work around this bug Linux excludes E820 reserved addresses when > >>>>>>>> allocating addresses from the PCI host bridge window since 2010. > >>>>>>>> ... > >>>>> > >>>>>>> I haven't seen anybody else eager to merge this, so I guess I'll stick > >>>>>>> my neck out here. > >>>>>>> > >>>>>>> I applied this to my for-linus branch for v5.15. > >>>>>> > >>>>>> Thank you, and sorry about the build-errors which the lkp > >>>>>> kernel-test-robot found. > >>>>>> > >>>>>> I've just send out a patch which fixes these build-errors > >>>>>> (verified with both .config-s from the lkp reports). > >>>>>> Feel free to squash this into the original patch (or keep > >>>>>> them separate, whatever works for you). > >>>>> > >>>>> Thanks, I squashed the fix in. > >>>>> > >>>>> HOWEVER, I think it would be fairly risky to push this into v5.15. > >>>>> We would be relying on the assumption that current machines have all > >>>>> fixed the BIOS defect that 4dc2287c1805 addressed, and we have little > >>>>> evidence for that. > >>>>> > >>>>> I'm not sure there's significant benefit to having this in v5.15. > >>>>> Yes, the mainline v5.15 kernel would work on the affected machines, > >>>>> but I suspect most people with those machines are running distro > >>>>> kernels, not mainline kernels. > >>>> > >>>> I understand that you were reluctant to add this to 5.15 so close > >>>> near the end of the 5.15 cycle, but can we please get this into > >>>> 5.16 now ? > >>>> > >>>> I know you ultimately want to see if there is a better fix, > >>>> but this is hitting a *lot* of users right now and if we come up > >>>> with a better fix we can always use that to replace this one > >>>> later. > >>> > >>> I don't know whether there's a "better" fix, but I do know that if we > >>> merge what we have right now, nobody will be looking for a better > >>> one. > >>> > >>> We're in the middle of the merge window, so the v5.16 development > >>> cycle is over. The v5.17 cycle is just starting, so we have time to > >>> hit that. Obviously a fix can be backported to older kernels as > >>> needed. > >>> > >>>> So can we please just go with this fix now, so that we can > >>>> fix the issues a lot of users are seeing caused by the current > >>>> *wrong* behavior of taking the e820 reservations into account ? > >>> > >>> I think the fix on the table is "ignore E820 for BIOS date >= 2018" > >>> plus the obvious parameters to force it both ways. > >> > >> Correct. > >> > >>> The thing I don't like is that this isn't connected at all to the > >>> actual BIOS defect. We have no indication that current BIOSes have > >>> fixed the defect, > >> > >> We also have no indication that that defect from 10 years ago, from > >> pre UEFI firmware is still present in modern day UEFI firmware which > >> is basically an entire different code-base. > >> > >> And even 10 years ago the problem was only happening to a single > >> family of laptop models (Dell Precision laptops) so this clearly > >> was a bug in that specific implementation and not some generic > >> issue which is likely to be carried forward. > >> > >>> and we have no assurance that future ones will not > >>> have the defect. It would be better if we had some algorithmic way of > >>> figuring out what to do. > >> > >> You yourself have said that in hindsight taking E820 reservations > >> into account for PCI bridge host windows was a mistake. So what > >> the "ignore E820 for BIOS date >= 2018" is doing is letting the > >> past be the past (without regressing on older models) while fixing > >> that mistake on any hardware going forward. > >> > >> In the unlikely case that we hit that BIOS bug again on 1 or 2 models, > >> we can simply DMI quirk those models, as we do for countless other > >> BIOS issues. > >> > >>> Thank you very much for chasing down the dmesg log archive > >>> (https://github.com/linuxhw/Dmesg; see > >>> https://lore.kernel.org/r/82035130-d810-9f0b-259e-61280de1d81f@xxxxxxxxxx). > >>> Unfortunately I haven't had time to look through it myself, and I > >>> haven't heard of anybody else doing it either. > >> > >> Right, I'm afraid that I already have spend way too much time on this > >> myself. Note that I've been working with users on this bug on and off > >> for over a year now. > >> > >> This is hitting many users and now that we have a viable fix, this > >> really needs to be fixed now. > >> > >> I believe that the "ignore E820 for BIOS date >= 2018" fix is good > >> enough and that you are letting perfect be the enemy of good here. > >> > >> As an upstream kernel maintainer myself, I'm sorry to say this, > >> but if we don't get some fix for this merged soon you are leaving > >> my no choice but to add my fix to the Fedora kernels as a downstream > >> patch (and to advise other distros to do the same). > >> > >> Note that if you are still afraid of regressions going the downstream > >> route is also an opportunity, Fedora will start testing moving users > >> to 5.15.y soon, so I could add the patch to Fedora's 5.15.y builds and > >> see how that goes ? > > > > So I've discussed this with the Fedora kernel maintainers and they have > > agreed to add the patch to the Fedora 5.15 kernels, which we will ask > > our users to start testing soon (we first run some voluntary testing > > before eventually moving all users over). > > > > This will provide us with valuable feedback wrt this patch causing > > regressions as you are worried about, or not. > > > > Assuming no regressions show up I hope that this will give you > > some assurance that there the patch causes no regressions and that > > you will then be willing to pick this up later during the 5.16 > > cycle so that Fedora only deviates from upstream for 1 cycle. > > 5.15.y kernels with this patch added have been in Fedora's > stable updates repo for a while now without any reports of the > regressions you feared this may cause. > > Bjorn, I hope that you are willing to merge this patch now that it has > seen some more wide spread testing ? I'm still not happy about the idea of basing this on BIOS dates. I did this with 7bc5e3f2be32 ("x86/PCI: use host bridge _CRS info by default on 2008 and newer machines"), and it was a mistake. Because of that mistake, we now have the use_crs/nocrs kernel parameters, which confuse users and lead to them being passed around as "fixes" on random bulletin boards. Adding another BIOS date check and use_e820/no_e820 kernel parameters feels like it's layering on more complexity to cover up another major mistake I made, 4dc2287c1805 ("x86: avoid E820 regions when allocating address space"). I think it would be better for the code to recognize the situation addressed by 4dc2287c1805 and deal with it directly. Is that possible? I dunno; I don't think we've really tried. Bjorn