On 03/07/2024 00:22, Charlie Jenkins wrote: > On Mon, Jul 01, 2024 at 04:20:15PM +0200, Clément Léger wrote: >> >> >> On 01/07/2024 15:58, Conor Dooley wrote: >>> On Mon, Jul 01, 2024 at 09:15:09AM +0200, Clément Léger wrote: >>>> >>>> >>>> On 27/06/2024 23:20, Charlie Jenkins wrote: >>>>> On Wed, Jun 26, 2024 at 03:39:14PM +0100, Conor Dooley wrote: >>>>>> On Mon, Jun 24, 2024 at 08:49:57PM -0400, Jesse Taube wrote: >>>>>>> Check for Zicclsm before checking for unaligned access speed. This will >>>>>>> greatly reduce the boot up time as finding the access speed is no longer >>>>>>> necessary. >>>>>>> >>>>>>> Signed-off-by: Jesse Taube <jesse@xxxxxxxxxxxx> >>>>>>> --- >>>>>>> V2 -> V3: >>>>>>> - New patch split from previous patch >>>>>>> --- >>>>>>> arch/riscv/kernel/unaligned_access_speed.c | 26 ++++++++++++++-------- >>>>>>> 1 file changed, 17 insertions(+), 9 deletions(-) >>>>>>> >>>>>>> diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c >>>>>>> index a9a6bcb02acf..329fd289b5c8 100644 >>>>>>> --- a/arch/riscv/kernel/unaligned_access_speed.c >>>>>>> +++ b/arch/riscv/kernel/unaligned_access_speed.c >>>>>>> @@ -259,23 +259,31 @@ static int check_unaligned_access_speed_all_cpus(void) >>>>>>> kfree(bufs); >>>>>>> return 0; >>>>>>> } >>>>>>> +#else /* CONFIG_RISCV_PROBE_UNALIGNED_ACCESS */ >>>>>>> +static int check_unaligned_access_speed_all_cpus(void) >>>>>>> +{ >>>>>>> + return 0; >>>>>>> +} >>>>>>> +#endif >>>>>>> >>>>>>> static int check_unaligned_access_all_cpus(void) >>>>>>> { >>>>>>> - bool all_cpus_emulated = check_unaligned_access_emulated_all_cpus(); >>>>>>> + bool all_cpus_emulated; >>>>>>> + int cpu; >>>>>>> + >>>>>>> + if (riscv_has_extension_unlikely(RISCV_ISA_EXT_ZICCLSM)) { >>>>>>> + for_each_online_cpu(cpu) { >>>>>>> + per_cpu(misaligned_access_speed, cpu) = RISCV_HWPROBE_MISALIGNED_FAST; >>>>>> >>>>>> - const: zicclsm >>>>>> description: >>>>>> The standard Zicclsm extension for misaligned support for all regular >>>>>> load and store instructions (including scalar and vector) but not AMOs >>>>>> or other specialized forms of memory access. Defined in the >>>>>> RISC-V RVA Profiles Specification. >>>>>> >>>>>> Doesn't, unfortunately, say anywhere there that they're actually fast :( >>>>> >>>>> Oh no! That is unfortunate that the ISA does not explicitly call that >>>>> out, but I think that acceptable. >>>>> >>>>> If a vendor puts Zicclsm in their isa string, they should expect >>>>> software to take advantage of misaligned accesses. FAST is our signal to >>>>> tell software that they should emit misaligned accesses. >>>> >>>> AFAIK, Zicclsm is not even an ISA extension, simply a profile >>>> specification which means that only the execution environment which >>>> provides the profile support misaligned accesses (cf >>>> https://lists.riscv.org/g/tech-profiles/message/56). >>> >>> I dunno, the specification status page used to describe it as an >>> extension: >>> https://wiki.riscv.org/display/HOME/Specification+Status+-+Historical >>> My understanding was that these could be considered extensions, just >>> like we are considering svade to be one. >>> >>>> . I don't think we >>>> can extrapolate that the misaligned accesses will be fast at all. >>> >>> That is my opinion on it too. If it doesn't say "fast" and give a >>> definition for what that means in the binding, then we can't assume that >>> it is fast. I'm also wary of extending definitions of extensions in the >>> binding, because a) I am 90% sure that people writing devicetrees don't >>> care and b) it'd be a potential difference between DT and ACPI without a >>> real justification (unlike the zkr or svade/svadu situations). >> >> BTW, the profile spec [1] has a note that states the following for Zicclsm: >> >> "Even though mandated, misaligned loads and stores might execute >> extremely slowly. Standard software distributions should assume their >> existence only for correctness, not for performance." >> >> Which was also quoted in patch 1, so I guess that settles it. > > The intention here was to allow vendors to configure an option to skip > the probing. This extension does not seem useful as it is written! A way > around this would be to add a kernel arg to set the access speed but > maybe it doesn't matter. For the sake of this patch, it looks like we > should get rid of this Zicclsm check. I think a parameter could be appropriate for vendors that want to skip the probing and gain a bit of time on boot time. Other options already exists to force specific settings so, why not ! Thanks, Clément > > - Charlie > >> >> Thanks, >> >> Clément >> >> Link: >> https://github.com/riscv/riscv-profiles/blob/main/src/profiles.adoc?plain=1#L524 >> [1] >> >>> >>>>> This allows for a generic kernel, like the one a distro would compile, to >>>>> skip the probing when booting on a system that explicitly called out >>>>> that the hardware supports misaligned accesses.