On Wed, May 22, 2024 at 09:26:08AM +0200, Andrew Jones wrote:
> On Tue, May 21, 2024 at 11:36:06AM GMT, Evan Green wrote:
> > On Sat, May 18, 2024 at 9:00 AM Yangyu Chen <cyy@xxxxxxxxxxxx> wrote:
> > >
> > > The original documentation says hwprobe keys are bitmasks, but actually,
> > > they are values. This patch clarifies this to avoid confusion.
> > >
> > > Signed-off-by: Yangyu Chen <cyy@xxxxxxxxxxxx>
> >
> > Hm, we also have this problem in the code, since
> > hwprobe_key_is_bitmask() returns true for KEY_CPUPERF_0. This results
> > in wrong information being returned for queries using the WHICH_CPU
> > flag. If usermode asked for the set of CPUs that was specifically SLOW
> > or EMULATED, the returned cpuset would also include cpus that were
> > FAST. I believe all other queries are okay.
> >
> > The one-liner fix is to just not return true for that key in
> > hwprobe_key_is_bitmask(). But that's technically user-visible: if some
> > software relied on the buggy behavior of FAST cpus being swept up in
> > the query for SLOW or EMULATED cpus, this change would expose that.
> > The grownups-eat-their-vegetables thing to do would be to define a new
> > key that returns this same value, but doesn't return true in
> > hwprobe_key_is_bitmask(). What do people think?
>
> Even though I actually enjoy eating vegetables, I think it's unlikely
> that we need to be so cautious for this. I feel like kernel updates
> provide a bit of freedom to change results of hardware query syscalls,
> even when run on the same hardware. Particularly the EMULATED query,
> which I guess could change with a firmware update. And, even the SLOW
> query could change if the probing was modified directly or indirectly.
> IOW, applications that use the which-cpus syscall shouldn't freak out
> if they don't get the same cpuset after a kernel update, which means
> we can drop the FAST cpus from the result.

Dropping FAST does not really address the problem here.
These values are being treated as a bitmap when they are in fact just
values. Because they are currently treated as a bitmap, it is
semantically acceptable to "or" two of them together and pass the
result into hwprobe. Dropping FAST cpus from the result for an input of
EMULATED | SLOW will make it appear to work, but it is not reasonable
to try to support or'ing of values that are not bitmaps. If a value of
5 is ever introduced, then an input of UNSUPPORTED | EMULATED will also
return cpus that map to the new value.

If hwprobe receives a value of 3 for CPUPERF_0 and or'ing is supported,
that could be interpreted as the user asking for cpus that are EMULATED
or SLOW or FAST, or just FAST, or just EMULATED or SLOW. The current
implementation returns only FAST cpus, because or'ing values like this
is nonsense.

Or'ing values together should not be supported for CPUPERF_0, since
that is what causes this issue. However, the original intent seems to
have been to allow it. Using CPUPERF_0 becomes much more verbose
without the ability to or values together: if an application wants to
know whether misaligned accesses are supported, it currently needs 3
separate hwprobe calls, one each for EMULATED, SLOW, and FAST.

Changing CPUPERF_0 to be a true bitmap would require changing each of
the existing values, which is not an acceptable API change. In my
previous email I proposed a new key, which could be something like
RISCV_HWPROBE_KEY_CPUPERF_1, that is actually a bitmap. That way a
query of EMULATED | SLOW | FAST would be possible in a single hwprobe
call, which seems to be the original intent of CPUPERF_0.
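
To make the ambiguity concrete, here is a minimal sketch of the two
matching styles. It is not the kernel code: the MISALIGNED_* names are
local copies, the numeric values 0..4 are assumed from the arithmetic
above (EMULATED | SLOW == 3, UNSUPPORTED | EMULATED == 5), and all three
helpers plus the one-bit-per-value encoding are hypothetical
illustrations of what a bitmap key could look like.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the misaligned-access values (assumed numbering). */
enum {
	MISALIGNED_UNKNOWN     = 0,
	MISALIGNED_EMULATED    = 1,
	MISALIGNED_SLOW        = 2,
	MISALIGNED_FAST        = 3,
	MISALIGNED_UNSUPPORTED = 4,
};

/* Hypothetical: match as if the key were a bitmap (all query bits set). */
static inline bool match_as_bitmask(uint64_t cpu_value, uint64_t query)
{
	return (cpu_value & query) == query;
}

/* Hypothetical: match the key as a plain value (exact equality). */
static inline bool match_as_value(uint64_t cpu_value, uint64_t query)
{
	return cpu_value == query;
}

/* Hypothetical bitmap encoding for a new key: one bit per value. */
#define PERF_BIT(v) (UINT64_C(1) << (v))

/* Hypothetical: true if the cpu's value is any of the requested ones. */
static inline bool match_any(uint64_t cpu_value, uint64_t query_bits)
{
	return (PERF_BIT(cpu_value) & query_bits) != 0;
}
```

With bitmask-style matching, a query for EMULATED also matches a FAST
cpu, and a query for EMULATED | SLOW matches only FAST cpus. With one
bit per value, PERF_BIT(MISALIGNED_EMULATED) | PERF_BIT(MISALIGNED_SLOW)
matches EMULATED and SLOW cpus and nothing else.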

- Charlie

>
> Thanks,
> drew
> >
> > -Evan
> >
> > > ---
> > >  Documentation/arch/riscv/hwprobe.rst | 31 ++++++++++++++++------------
> > >  1 file changed, 18 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/Documentation/arch/riscv/hwprobe.rst b/Documentation/arch/riscv/hwprobe.rst
> > > index 239be63f5089..4abfa3f9fe44 100644
> > > --- a/Documentation/arch/riscv/hwprobe.rst
> > > +++ b/Documentation/arch/riscv/hwprobe.rst
> > > @@ -188,25 +188,30 @@ The following keys are defined:
> > >         manual starting from commit 95cf1f9 ("Add changes requested by Ved
> > >         during signoff")
> > >
> > > -* :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: A bitmask that contains performance
> > > +* :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: A value that contains performance
> > >    information about the selected set of processors.
> > >
> > > -  * :c:macro:`RISCV_HWPROBE_MISALIGNED_UNKNOWN`: The performance of misaligned
> > > -    scalar accesses is unknown.
> > > +  * :c:macro:`RISCV_HWPROBE_MISALIGNED_MASK`: The bitmask of the misaligned
> > > +    access performance field in the value of key `RISCV_HWPROBE_KEY_CPUPERF_0`.
> > >
> > > -  * :c:macro:`RISCV_HWPROBE_MISALIGNED_EMULATED`: Misaligned scalar accesses are
> > > -    emulated via software, either in or below the kernel.  These accesses are
> > > -    always extremely slow.
> > > +    The following values (not bitmasks) in this field are defined:
> > >
> > > -  * :c:macro:`RISCV_HWPROBE_MISALIGNED_SLOW`: Misaligned scalar accesses are
> > > -    slower than equivalent byte accesses.  Misaligned accesses may be supported
> > > -    directly in hardware, or trapped and emulated by software.
> > > +    * :c:macro:`RISCV_HWPROBE_MISALIGNED_UNKNOWN`: The performance of misaligned
> > > +      scalar accesses is unknown.
> > >
> > > -  * :c:macro:`RISCV_HWPROBE_MISALIGNED_FAST`: Misaligned scalar accesses are
> > > -    faster than equivalent byte accesses.
> > > +    * :c:macro:`RISCV_HWPROBE_MISALIGNED_EMULATED`: Misaligned scalar accesses are
> > > +      emulated via software, either in or below the kernel.  These accesses are
> > > +      always extremely slow.
> > >
> > > -  * :c:macro:`RISCV_HWPROBE_MISALIGNED_UNSUPPORTED`: Misaligned scalar accesses
> > > -    are not supported at all and will generate a misaligned address fault.
> > > +    * :c:macro:`RISCV_HWPROBE_MISALIGNED_SLOW`: Misaligned scalar accesses are
> > > +      slower than equivalent byte accesses.  Misaligned accesses may be supported
> > > +      directly in hardware, or trapped and emulated by software.
> > > +
> > > +    * :c:macro:`RISCV_HWPROBE_MISALIGNED_FAST`: Misaligned scalar accesses are
> > > +      faster than equivalent byte accesses.
> > > +
> > > +    * :c:macro:`RISCV_HWPROBE_MISALIGNED_UNSUPPORTED`: Misaligned scalar accesses
> > > +      are not supported at all and will generate a misaligned address fault.
> > >
> > >  * :c:macro:`RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE`: An unsigned int which
> > >    represents the size of the Zicboz block in bytes.
> > > --
> > > 2.43.0
> > >