On Fri, Apr 12, 2024 at 01:34:43PM -0700, Charlie Jenkins wrote:
> On Fri, Apr 12, 2024 at 08:26:12PM +0100, Conor Dooley wrote:
> > On Fri, Apr 12, 2024 at 11:46:21AM -0700, Charlie Jenkins wrote:
> > > On Fri, Apr 12, 2024 at 07:38:04PM +0100, Conor Dooley wrote:
> > > > On Fri, Apr 12, 2024 at 10:04:17AM -0700, Evan Green wrote:
> > > > > On Fri, Apr 12, 2024 at 3:26 AM Conor Dooley <conor.dooley@xxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Thu, Apr 11, 2024 at 09:11:08PM -0700, Charlie Jenkins wrote:
> > > > > > > The riscv_cpuinfo struct that contains mvendorid and marchid is not
> > > > > > > populated until all harts are booted which happens after the DT parsing.
> > > > > > > Use the vendorid/archid values from the DT if available or assume all
> > > > > > > harts have the same values as the boot hart as a fallback.
> > > > > > >
> > > > > > > Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
> > > > > >
> > > > > > If this is our only use case for getting the mvendorid/marchid stuff
> > > > > > from dt, then I don't think we should add it. None of the devicetrees
> > > > > > that the commit you're fixing here addresses will have these properties
> > > > > > and if they did have them, they'd then also be new enough to hopefully
> > > > > > not have "v" either - the issue is they're using whatever crap the
> > > > > > vendor shipped.
> > > > > > If we're gonna get the information from DT, we already have something
> > > > > > that we can look at to perform the disable as the cpu compatibles give
> > > > > > us enough information to make the decision.
> > > > > >
> > > > > > I also think that we could just cache the boot CPU's marchid/mvendorid,
> > > > > > since we already have to look at it in riscv_fill_cpu_mfr_info(), avoid
> > > > > > repeating these ecalls on all systems.
> > > > > >
> > > > > > Perhaps for now we could just look at the boot CPU alone? To my
> > > > > > knowledge the systems that this targets all have homogeneous
> > > > > > marchid/mvendorid values of 0x0.
> > > > >
> > > > > It's possible I'm misinterpreting, but is the suggestion to apply the
> > > > > marchid/mvendorid we find on the boot CPU and assume it's the same on
> > > > > all other CPUs? Since we're reporting the marchid/mvendorid/mimpid to
> > > > > usermode in a per-hart way, it would be better IMO if we really do
> > > > > query marchid/mvendorid/mimpid on each hart. The problem with applying
> > > > > the boot CPU's value everywhere is if we're ever wrong in the future
> > > > > (ie that assumption doesn't hold on some machine), we'll only find out
> > > > > about it after the fact. Since we reported the wrong information to
> > > > > usermode via hwprobe, it'll be an ugly userspace ABI issue to clean
> > > > > up.
> > > >
> > > > You're misinterpreting, we do get the values on all individually as
> > > > they're brought online. This is only used by the code that throws a bone
> > > > to people with crappy vendor dtbs that put "v" in riscv,isa when they
> > > > support the unratified version.
> > >
> > > Not quite,
> >
> > Remember that this patch stands in isolation and the justification given
> > in your commit message does not mention anything other than fixing my
> > broken patch.
>
> Fixing the patch in the simplest sense would be to eagerly get the
> mvendorid/marchid without using the cached version. But this assumes
> that all harts have the same mvendorid/marchid. This is not something
> that I am strongly attached to. If it truly is detrimental to Linux to
> allow a user a way to specify different vendorids for different harts
> then I will remove that code.

I think that the simple fix is all that we need to do here, perhaps
updating the comment to point out how naive we are being.

> >
> > > the alternatives are patched before the other cpus are
> > > booted, so the alternatives will have false positives resulting in
> > > broken kernels.
> >
> > Over-eagerly disabling vector isn't going to break any kernels and
> > really should not break a behaving userspace either.
> > Under-eagerly disabling it (in a way that this approach could solve) is
> > only going to happen on a system where the boot hart has non-zero values
> > and claims support for v but a non-boot hart has zero values and
> > claims support for v but actually doesn't implement the ratified version.
> > If the boot hart doesn't support v, then we currently disable the
> > extension as only homogeneous stuff is supported by Linux. If the boot
> > hart claims support for "v" but doesn't actually implement the ratified
> > version neither the intent of my original patch nor this fix for it are
> > going to help avoid a broken kernel.
> >
> > I think we do have a problem if the boot cpu having some erratum leads
> > to the kernel being patched in a way that does not work for the other
> > CPUs on the system, but I don't think this series addresses that sort of
> > issue at all as you'd be adding code to the pi section if you were fixing
> > it. I also don't think we should be making pre-emptive changes to the
> > errata patching code either to solve that sort of problem, until an SoC
> > shows up where things don't work.
> > Cheers,
> > Conor.