On Mon, Aug 28, 2023 at 09:44:55AM -0700, Evan Green wrote: > On Sat, Aug 26, 2023 at 1:01 AM Andrew Jones <ajones@xxxxxxxxxxxxxxxx> wrote: > > > > On Fri, Aug 25, 2023 at 03:51:28PM -0700, Evan Green wrote: > > > On Fri, Aug 25, 2023 at 1:16 AM Andrew Jones <ajones@xxxxxxxxxxxxxxxx> wrote: > > > > > > > > On Thu, Aug 24, 2023 at 03:06:53PM -0700, Evan Green wrote: > > > > > On Thu, Aug 24, 2023 at 10:29 AM Conor Dooley <conor@xxxxxxxxxx> wrote: > > > > ... > > > > > > Do you want to have this new thing in cpuinfo tell the user "this hart > > > > > > has xyz extensions that are supported by a kernel, but maybe not this > > > > > > kernel" or to tell the user "this hart has xyz extensions that are > > > > > > supported by this kernel"? Your text above says "understood by the > > > > > > kernel", but I think that's a poor definition that needs to be improved > > > > > > to spell out exactly what you mean. IOW does "understood" mean the > > > > > > kernel will parse them into a structure, or does it mean "yes you can > > > > > > use this extension on this particular hart". > > > > > > > > > > I'm imagining /proc/cpuinfo being closer to "the CPU has it and the > > > > > kernel at least vaguely understands it, but may not have full support > > > > > for it enabled". I'm assuming /proc/cpuinfo is mostly used by 1) > > > > > humans wanting to know if they have hardware support for a feature, > > > > > and 2) administrators collecting telemetry to manage fleets (ie do I > > > > > have any hardware deployed that supports X). > > > > > > > > Is (2) a special case of (1)? (I want to make sure I understand all the > > > > cases.) > > > > > > More or less, yes. In bucket two are also folks wondering things like > > > "are all these crash reports I'm getting specific to machines with X". > > > > > > > > > > > > Programmers looking to > > > > > see "is the kernel support for this feature ready right now" would > > > > > ideally not be parsing /proc/cpuinfo text, as more direct mechanisms > > > > > like specific hwprobe bits for "am I fully ready to go" would be > > > > > easier to work with. Feel free to yell at me if this overall vision > > > > > seems flawed. > > > > > > > > > > I tried to look to see if there was consensus among the other > > > > > architectures. Aarch64 seems to go with "supported and fully enabled", > > > > > as their cpu_has_feature() directly tests elf_hwcap, and elements in > > > > > arm64_elf_hwcaps[] are Kconfig gated. X86 is complicated, but IIRC is > > > > > more along the lines of "hardware has it". They have two macros, > > > > > cpu_has() for "raw capability" and cpu_feature_enabled() for "kernel > > > > > can do it too", and they use cpu_has() for /proc/cpuinfo flags. > > > > > > > > > > > > > I'd lean more towards the way AArch64 goes, because, unless /proc/cpuinfo > > > > is just a blind regurgitation of an isa string from DT / ACPI, then the > > > > kernel must at least know something about it. Advertising a feature which > > > > is known, but also known not to work, seems odd to me. So my vote is that > > > > only features which are present and enabled in the kernel or present and > > > > not necessary to be enabled in the kernel in order for userspace or > > > > virtual machines to use be advertised in /proc/cpuinfo. > > > > > > > > We still have SMBIOS (dmidecode) to blindly dump what the hardware > > > > supports for cases (1) and (2) above. > > > > > > Yeah, there's an argument to be made for that. My worry is it's a > > > difficult line to hold. The bar you're really trying to describe (or > > > at least what people might take away from it) is "if it's listed here > > > then it's fully ready to be used in userspace". But then things get > > > squishy when there are additional ways to control the use of the > > > feature (prctls() in init to turn it on, usermode policy to turn it > > > off, security doodads that disable it, etc). I'm assuming nobody wants > > > a version of /proc/cpuinfo that changes depending on which process is > > > asking. So then the line would have to be more carefully described as > > > "well, the hardware can do it, and the kernel COULD do it under some > > > circumstances, but YMMV", which ends up falling somewhat short of the > > > original goal. In my mind keeping /proc/cpuinfo as close to "here's > > > what the hardware can do" seems like a more defensible position. > > > -Evan > > > > I agree with that. I was actually even trying to say the same thing, > > but only by bringing up virtual machines. Once we decide we'll expose > > extensions to VMs, whether or not the host kernel enables them, then > > none of the other host kernel configurations matter with respect to > > advertising the feature, since the guest kernel may have a completely > > different set of configurations. > > My head spins a little when I try to picture a feature which 1) > requires kernel support to use, 2) has that kernel support turned off > in the host kernel, but 3) is passed down into guest kernels. Mine did too, but apparently these already exist for kvm guests. I can't find the exact email, but either Drew or Anup told me that Svpbmt can be used by a guest even if support for it is not present in the host kernel. Thanks, Conor. > Generally though, I agree that trying to tie the guarantees of > features in /proc/cpuinfo too much to software gets confusing when > viewed through the double lens of virtualization. > > > > > So I think we should only be filtering out extensions that are disabled > > because they're broken (have a detected erratum), have been "hidden" > > (have a kernel command line allowing them to be treated as if not > > present), or cannot be used at all due to missing accompanying hardware > > descriptions (such as block size info for CBO extensions). In all cases, > > I presume we'd wire checks up in riscv_isa_extension_check() and no > > checks would be gated on Kconfigs or anything else. And, since > > /proc/cpuinfo gets its list from the bitmap that's already filtered by > > riscv_isa_extension_check(), then, long story short, we're good to go :-) > > > > But maybe we can try to spell that policy out a bit more in > > Documentation/riscv/uabi.rst. > > Right, that sounds reasonable to me, and is consistent with the > behavior we already have. With this documentation change I was only > trying to define the lower bound, rather than the complete definition > for every case. In other words, seeing a feature in cpuinfo guarantees > only that the hardware (or virtualized hardware) supports the feature, > but that's all the language says. So for instance NOT seeing a feature > in cpuinfo doesn't necessarily mean the hardware doesn't support it. > Software turning it off for the reasons you describe IMO doesn't > contradict what's written here. I was planning to leave that tacit, > but if you have suggestions on how to spell that out I'd take them. > > -Evan
Attachment:
signature.asc
Description: PGP signature