On Wed, Aug 09, 2023 at 02:01:07AM -0700, Atish Kumar Patra wrote: > On Tue, Aug 8, 2023 at 6:39 AM Conor Dooley <conor@xxxxxxxxxx> wrote: > > > > On Tue, Aug 08, 2023 at 12:54:11AM -0700, Atish Kumar Patra wrote: > > > On Wed, Aug 2, 2023 at 4:14 AM Conor Dooley <conor.dooley@xxxxxxxxxxxxx> wrote: > > > > > > > > Add an erratum for versions [v0.8 to v1.3) of OpenSBI which fail to add > > > > the "no-map" property to the reserved memory nodes for the regions it > > > > has protected using PMPs. > > > > > > > > Our existing fix sweeping hibernation under the carpet by marking it > > > > NONPORTABLE is insufficient as there are other ways to generate > > > > accesses to these reserved memory regions, as Petr discovered [1] > > > > while testing crash kernels & kdump. > > > > > > > > Intercede during the boot process when the afflicted versions of OpenSBI > > > > are present & set the "no-map" property in all "mmode_resv" nodes before > > > > the kernel does its reserved memory region initialisation. > > > > > > > > > > We have different mechanisms of DT being passed to the kernel. > > > > > > 1. A prior stage(e.g U-Boot SPL) to M-mode runtime firmware (e.g. > > > OpenSBI, rustSBI) passes the DT to M-mode runtime firmware and it > > > passes to the next stage. > > > In this case, M-mode runtime firmware gets a chance to update the > > > no-map property in DT that the kernel can parse. > > > > > > 2. User loads the DT from the boot loader (e.g EDK2, U-Boot proper). > > > Any DT patching done by the M-mode firmware is useless. If these DTBs > > > don't have the no-map > > > property, hibernation or EFI booting will have issues as well. > > > > > > > > We are trying to solve only one part of problem #1 in this patch. > > > > Correct. > > > > If someone's second stage is also providing an incorrect devicetree > > then, yeah, this approach would fall apart - but it's the firmware > > provided devicetree being incorrect that I am trying to account for > > here. If a person incorrectly constructed one, I am not really sure what > > we can do for them, they incorrect described their hardware /shrug > > My patch should of course help in some of the scenarios you mention above > > if the name of the reserved memory region from OpenSBI is propagated by > > the second-stage bootloader, but that is just an extension of case 1, > > not case 2. > > > > > I > > > don't think any other M-mode runtime firmware patches DT with no-map > > > property as well. > > > Please let me know if I am wrong about that. The problem is not > > > restricted to [v0.8 to v1.3) of OpenSBI. > > > > It comes down to Alex's question - do we want to fix this kind of > > firmware issue in the kernel? Ultimately this is a policy decision that > > "somebody" has to make. Maybe the list of firmwares that need this > > IMO, we shouldn't as this is a slippery slope. Kernel can't fix every > firmware bug by having erratas. > I agree with your point below about firmware in shipping products. I > am not aware of any official products shipping anything other than > OpenSBI either. > However, I have seen users using other firmwares in their dev > environment. If someone's already changed their boards firmware, I have less sympathy for them, as they should be able to make further changes. Punters buying SBCs to install Fedora or Debian w/o having to consider their firmware are who I am more interested in helping. > IMHO, this approach sets a bad precedent for the future especially > when it only solves one part of the problem. Yeah, I'm certainly wary of setting an unwise precedent here. Inevitably we will need to have firmware-related errata and it'd be good to have a policy for what is (or more importantly what isn't acceptable). Certainly we have said that known-broken version of OpenSBI that T-Head puts in their SDK is not supported by the mainline kernel. On the latter part, I'm perfectly happy to expand the erratum to cover all affected firmwares, but I wasn't even sure if my fix worked properly, hence the request for testing from those who encountered the problem. > We shouldn't hide firmware bugs in the kernel when an upgraded > firmware is already available. Just to note, availability of an updated firmware upstream does not necessarily mean that corresponding update is possible for affected hardware. > This bug is well documented in various threads and fixed in the latest > version of OpenSBI. > I am assuming other firmwares will follow it as well. > > Anybody facing hibernation or efi related booting issues should just > upgrade to the latest version of firmware (e.g OpenSBI v1.3) > Latest version of Qemu will support(if not happened already) the > latest version of OpenSBI. > > This issue will only manifest in kernels 6.4 or higher. Any user > facing these with the latest kernel can also upgrade the firmware. > Do you see any issue with that ? I don't think it is fair to compare the ease of upgrading the kernel to that required to upgrade a boards firmware, with the latter being far, far more inconvenient on pretty much all of the boards that I have. I'm perfectly happy to drop this series though, if people generally are of the opinion that this sort of firmware workaround is ill-advised. We are unaffected by it, so I certainly have no pressure to have something working here. It's my desire not to be user-hostile that motivated this patch.
Attachment:
signature.asc
Description: PGP signature