On Fri, 1 May 2020 at 15:50, Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx> wrote: > > On Fri, May 01, 2020 at 03:10:59PM +0200, Ard Biesheuvel wrote: > > [...] > > > > >> If we silently fix things up, then people will continue to write broken > > > >> tables without even realising, new OSes will have to implement the same > > > >> mechanism because vendors will have little interest in changing things > > > >> that have worked "correctly" with Linux for years, and we've effectively > > > >> achieved a de-facto redefinition of the spec. Making our end of the > > > >> interface robust is obviously desirable, but there still needs to be > > > >> *some* incentive for the folks on the other end to get it right. > > > >> > > > > > > > > Agreed. But at least we'll be able to detect it and flag it in the > > > > general case, rather than having a special case for D05/06 only > > > > (although I suppose splitting the output ranges like those platforms > > > > do is rather unusual) > > > > > > Yup, in principle the fixed quirk list gives a nice reassuring sense of > > > "we'll work around these early platforms and everyone from now on will > > > get it right", but whether reality plays out that way is another matter > > > entirely... > > > > The reason I am looking into this is that I think the fix should go to > > stable, given that the current situation makes it impossible to write > > firmware that works with older and newer kernels. > > Yes. If we do remove the quirk the sooner we do it the better to > reduce the stable patches. > > > Lorenzo said he wouldn't mind taking the current version with ACPI OEM > > ID matching back to -stable, but it's another quirk list to manage, > > which I would prefer to avoid. > > > > But I don't care deeply either way, to be honest, as long as we can > > get something backported so compliant firmware is not being penalized > > anymore. > > Question: if we remove the iort_workaround_oem_info stuff but we *do* > keep the existing apply_id_count_workaround flag and we set it by going > through all the mappings at boot time and detect if any of these > off-by-one conditions apply - would the resulting code be any simpler ? > > The global flag would apply (as it does now) to _all_ mappings but it is > very likely that if the off-by-one firmware bug is present it applies to > the IORT table as a whole rather than a single mapping entry. > This particular issue is based on a misinterpretation, so I agree that it makes sense to have a global flag, as long as we only set it if the mappings are fully consistent in every other respect, or we'll run the risk of hitting issues like the one Robin describes, where things happen to work, but will fail once we apply the heuristic. Such an issue could exist on one end of the table, while we could spot the off-by-one issue somewhere else. Which brings us back to a point I made earlier: do we really want to validate the table and ensure that it is fully internally consistent? Or do we want to be robust in the face of a single known issue that we helped create? So in my opinion, just fixing it up when we run into it is fine. I can add the extra sanity check to reduce the potential fallout for other broken systems, but beyond that, I think we shouldn't do too much.