On Mon, Jul 01, 2024 at 04:20:15PM +0200, Clément Léger wrote: > > > On 01/07/2024 15:58, Conor Dooley wrote: > > On Mon, Jul 01, 2024 at 09:15:09AM +0200, Clément Léger wrote: > >> > >> > >> On 27/06/2024 23:20, Charlie Jenkins wrote: > >>> On Wed, Jun 26, 2024 at 03:39:14PM +0100, Conor Dooley wrote: > >>>> On Mon, Jun 24, 2024 at 08:49:57PM -0400, Jesse Taube wrote: > >>>>> Check for Zicclsm before checking for unaligned access speed. This will > >>>>> greatly reduce the boot up time as finding the access speed is no longer > >>>>> necessary. > >>>>> > >>>>> Signed-off-by: Jesse Taube <jesse@xxxxxxxxxxxx> > >>>>> --- > >>>>> V2 -> V3: > >>>>> - New patch split from previous patch > >>>>> --- > >>>>> arch/riscv/kernel/unaligned_access_speed.c | 26 ++++++++++++++-------- > >>>>> 1 file changed, 17 insertions(+), 9 deletions(-) > >>>>> > >>>>> diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c > >>>>> index a9a6bcb02acf..329fd289b5c8 100644 > >>>>> --- a/arch/riscv/kernel/unaligned_access_speed.c > >>>>> +++ b/arch/riscv/kernel/unaligned_access_speed.c > >>>>> @@ -259,23 +259,31 @@ static int check_unaligned_access_speed_all_cpus(void) > >>>>> kfree(bufs); > >>>>> return 0; > >>>>> } > >>>>> +#else /* CONFIG_RISCV_PROBE_UNALIGNED_ACCESS */ > >>>>> +static int check_unaligned_access_speed_all_cpus(void) > >>>>> +{ > >>>>> + return 0; > >>>>> +} > >>>>> +#endif > >>>>> > >>>>> static int check_unaligned_access_all_cpus(void) > >>>>> { > >>>>> - bool all_cpus_emulated = check_unaligned_access_emulated_all_cpus(); > >>>>> + bool all_cpus_emulated; > >>>>> + int cpu; > >>>>> + > >>>>> + if (riscv_has_extension_unlikely(RISCV_ISA_EXT_ZICCLSM)) { > >>>>> + for_each_online_cpu(cpu) { > >>>>> + per_cpu(misaligned_access_speed, cpu) = RISCV_HWPROBE_MISALIGNED_FAST; > >>>> > >>>> - const: zicclsm > >>>> description: > >>>> The standard Zicclsm extension for misaligned support for all regular > >>>> load and store instructions (including scalar and vector) but not AMOs > >>>> or other specialized forms of memory access. Defined in the > >>>> RISC-V RVA Profiles Specification. > >>>> > >>>> Doesn't, unfortunately, say anywhere there that they're actually fast :( > >>> > >>> Oh no! That is unfortunate that the ISA does not explicitly call that > >>> out, but I think that acceptable. > >>> > >>> If a vendor puts Zicclsm in their isa string, they should expect > >>> software to take advantage of misaligned accesses. FAST is our signal to > >>> tell software that they should emit misaligned accesses. > >> > >> AFAIK, Zicclsm is not even an ISA extension, simply a profile > >> specification which means that only the execution environment which > >> provides the profile support misaligned accesses (cf > >> https://lists.riscv.org/g/tech-profiles/message/56). > > > > I dunno, the specification status page used to describe it as an > > extension: > > https://wiki.riscv.org/display/HOME/Specification+Status+-+Historical > > My understanding was that these could be considered extensions, just > > like we are considering svade to be one. > > > >> . I don't think we > >> can extrapolate that the misaligned accesses will be fast at all. > > > > That is my opinion on it too. If it doesn't say "fast" and give a > > definition for what that means in the binding, then we can't assume that > > it is fast. I'm also wary of extending definitions of extensions in the > > binding, because a) I am 90% sure that people writing devicetrees don't > > care and b) it'd be a potential difference between DT and ACPI without a > > real justification (unlike the zkr or svade/svadu situations). > > BTW, the profile spec [1] has a note that states the following for Zicclsm: > > "Even though mandated, misaligned loads and stores might execute > extremely slowly. Standard software distributions should assume their > existence only for correctness, not for performance." > > Which was also quoted in patch 1, so I guess that settles it. The intention here was to allow vendors to configure an option to skip the probing. This extension does not seem useful as it is written! A way around this would be to add a kernel arg to set the access speed but maybe it doesn't matter. For the sake of this patch, it looks like we should get rid of this Zicclsm check. - Charlie > > Thanks, > > Clément > > Link: > https://github.com/riscv/riscv-profiles/blob/main/src/profiles.adoc?plain=1#L524 > [1] > > > > >>> This allows for a generic kernel, like the one a distro would compile, to > >>> skip the probing when booting on a system that explicitly called out > >>> that the hardware supports misaligned accesses.