Re: [RFT 1/2] RISC-V: handle missing "no-map" properties for OpenSBI's PMP protected regions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 6, 2023 at 2:26 PM Conor Dooley <conor@xxxxxxxxxx> wrote:
>
> On Wed, Dec 06, 2023 at 04:52:11AM -0800, Palmer Dabbelt wrote:
> > On Thu, 10 Aug 2023 02:07:10 PDT (-0700), Conor Dooley wrote:
> > > On Wed, Aug 09, 2023 at 02:01:07AM -0700, Atish Kumar Patra wrote:
> > > > On Tue, Aug 8, 2023 at 6:39 AM Conor Dooley <conor@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Tue, Aug 08, 2023 at 12:54:11AM -0700, Atish Kumar Patra wrote:
> > > > > > On Wed, Aug 2, 2023 at 4:14 AM Conor Dooley <conor.dooley@xxxxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > Add an erratum for versions [v0.8 to v1.3) of OpenSBI which fail to add
> > > > > > > the "no-map" property to the reserved memory nodes for the regions it
> > > > > > > has protected using PMPs.
> > > > > > >
> > > > > > > Our existing fix sweeping hibernation under the carpet by marking it
> > > > > > > NONPORTABLE is insufficient as there are other ways to generate
> > > > > > > accesses to these reserved memory regions, as Petr discovered [1]
> > > > > > > while testing crash kernels & kdump.
> > > > > > >
> > > > > > > Intercede during the boot process when the afflicted versions of OpenSBI
> > > > > > > are present & set the "no-map" property in all "mmode_resv" nodes before
> > > > > > > the kernel does its reserved memory region initialisation.
> > > > > > >
> > > > > >
> > > > > > We have different mechanisms of DT being passed to the kernel.
> > > > > >
> > > > > > 1. A prior stage(e.g U-Boot SPL) to M-mode runtime firmware (e.g.
> > > > > > OpenSBI, rustSBI) passes the DT to M-mode runtime firmware and it
> > > > > > passes to the next stage.
> > > > > > In this case, M-mode runtime firmware gets a chance to update the
> > > > > > no-map property in DT that the kernel can parse.
> > > > > >
> > > > > > 2. User loads the DT from the boot loader (e.g EDK2, U-Boot proper).
> > > > > > Any DT patching done by the M-mode firmware is useless. If these DTBs
> > > > > > don't have the no-map
> > > > > > property, hibernation or EFI booting will have issues as well.
> > > > > >
> > > > >
> > > > > > We are trying to solve only one part of problem #1 in this patch.
> > > > >
> > > > > Correct.
> > > > >
> > > > > If someone's second stage is also providing an incorrect devicetree
> > > > > then, yeah, this approach would fall apart - but it's the firmware
> > > > > provided devicetree being incorrect that I am trying to account for
> > > > > here. If a person incorrectly constructed one, I am not really sure what
> > > > > we can do for them, they incorrect described their hardware /shrug
> > > > > My patch should of course help in some of the scenarios you mention above
> > > > > if the name of the reserved memory region from OpenSBI is propagated by
> > > > > the second-stage bootloader, but that is just an extension of case 1,
> > > > > not case 2.
> > > > >
> > > > > > I
> > > > > > don't think any other M-mode runtime firmware patches DT with no-map
> > > > > > property as well.
> > > > > > Please let me know if I am wrong about that. The problem is not
> > > > > > restricted to [v0.8 to v1.3) of OpenSBI.
> > > > >
> > > > > It comes down to Alex's question - do we want to fix this kind of
> > > > > firmware issue in the kernel? Ultimately this is a policy decision that
> > > > > "somebody" has to make. Maybe the list of firmwares that need this
> > > >
> > > > IMO, we shouldn't as this is a slippery slope. Kernel can't fix every
> > > > firmware bug by having erratas.
> > > > I agree with your point below about firmware in shipping products. I
> > > > am not aware of any official products shipping anything other than
> > > > OpenSBI either.
> > >
> > > > However, I have seen users using other firmwares in their dev
> > > > environment.
> > >
> > > If someone's already changed their boards firmware, I have less sympathy
> > > for them, as they should be able to make further changes. Punters buying
> > > SBCs to install Fedora or Debian w/o having to consider their firmware
> > > are who I am more interested in helping.
> > >
> > > > IMHO, this approach sets a bad precedent for the future especially
> > > > when it only solves one part of the problem.
> > >
> > > Yeah, I'm certainly wary of setting an unwise precedent here.
> > > Inevitably we will need to have firmware-related errata and it'd be good
> > > to have a policy for what is (or more importantly what isn't
> > > acceptable). Certainly we have said that known-broken version of OpenSBI
> > > that T-Head puts in their SDK is not supported by the mainline kernel.
> > > On the latter part, I'm perfectly happy to expand the erratum to cover
> > > all affected firmwares, but I wasn't even sure if my fix worked
> > > properly, hence the request for testing from those who encountered the
> > > problem.
> > >
> > > > We shouldn't hide firmware bugs in the kernel when an upgraded
> > > > firmware is already available.
> > >
> > > Just to note, availability of an updated firmware upstream does not
> > > necessarily mean that corresponding update is possible for affected
> > > hardware.
> >
> > Yep.  I think we're been in a very hobbist-centric world in RISC-V land, but
> > in general trying to get people to update firmware is hard.  Part of the
> > whole "kernel updates don't break users" thing is what's underneath the
> > kernel, it's not just a uABI thing.
>
> Yeah, there's certainly an attitude that I think needs to go away, that
> updating firmware etc is something we can expect to be carried out on a
> universal basis. Or that fixing things in the upstream version of
> OpenSBI means it'll actually propagate down to system integrators.
>
> >
> > > > This bug is well documented in various threads and fixed in the latest
> > > > version of OpenSBI.
> > > > I am assuming other firmwares will follow it as well.
> > > >
> > > > Anybody facing hibernation or efi related booting issues should just
> > > > upgrade to the latest version of firmware (e.g OpenSBI v1.3)
> > > > Latest version of Qemu will support(if not happened already) the
> > > > latest version of OpenSBI.
> > > >
> > > > This issue will only manifest in kernels 6.4 or higher. Any user
> > > > facing these with the latest kernel can also upgrade the firmware.
> > > > Do you see any issue with that ?
> > >
> > > I don't think it is fair to compare the ease of upgrading the kernel
> > > to that required to upgrade a boards firmware, with the latter being
> > > far, far more inconvenient on pretty much all of the boards that I have.
> >
> > IMO we're in the same spot as every other port here, and generally they work
> > around firmware bugs when they've rolled out into production somewhere that
> > firmware updates aren't likely to happen quickly.  I'm not sure if there's
> > any sort of exact rules written down anywhere, but IMO if the bug is going
> > to impact users then we should deal with it.
> >
> > That applies for hardware bugs, but also firmware bugs (at a certain point
> > we won't be able to tell the difference).  We're sort of doing this with the
> > misaligned access handling, for example.
> >
> > > I'm perfectly happy to drop this series though, if people generally are
> > > of the opinion that this sort of firmware workaround is ill-advised.
> > > We are unaffected by it, so I certainly have no pressure to have
> > > something working here. It's my desire not to be user-hostile that
> > > motivated this patch.
> >
> > IIUC you guys and Reneas are the only ones who have hardware that might be
> > in a spot where users aren't able to update the firmware (ie, it's out in
> > production somewhere).
>
> I dunno if we can really keep thinking like that though. In terms of
> people who have devicetrees in the kernel and stuff available in western
> catalog distribution, sure.
> I don't think we can assume that that covers all users though, certainly
> the syntacore folks pop up every now and then, and I sure hope that
> Andes etc have larger customer bases than the in-kernel users would
> suggest.
>
> > So I'm adding Geert, though he probably saw this
> > months ago...
>
> Prabhakar might be a good call on that front. I'm not sure if the
> Renesas stuff works on affected versions of OpenSBI though, guess it
> depends on the sequencing of the support for the non-coherent stuff and
> when this bug was fixed.
>
ATM, I dont think there are any users who are using the upstream
kernel + OpenSBI (apart from me and Geert!). Currently the customers
are using the BSP releases.

Cheers,
Prabhakar

> > On that note: It's been ~4 months and it look like nobody's tested anything
> > (and the comments aren't really things that would preculde testing).
>
> Yeah, nobody seems to really have given a crap. I was hoping the
> StarFive guys that actually added the support for this would be
> interested in it, but alas they were not.
> I don't really care all that much - the platform I support is not
> affected by the problem and I just don't enable the option elsewhere.
>
> > So
> > maybe we just pick that second patch up into for-next and see what happens?
> > IIUC that will result in broken systems for users who haven't updated their
> > firmware.
> >
> > I agree that's a user-hostile way to do things, which is generally a bad way
> > to go, but if it's really true that there's no users then we're safe.
> > Probably also worth calling it out on sw-dev just to be safe.
>
> And if there are users, the fix is actually relatively straight-forward,
> just apply patch #1.
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@xxxxxxxxxxxxxxxxxxx
> http://lists.infradead.org/mailman/listinfo/linux-riscv





[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux