On Fri, Feb 10, 2023 at 09:05:15AM +0100, Andrew Jones wrote:
> On Thu, Feb 09, 2023 at 07:09:53PM +0000, Conor Dooley wrote:
> > On Thu, Feb 09, 2023 at 04:26:26PM +0100, Andrew Jones wrote:
> > > Using memset() to zero a 4K page takes 563 total instructions, where
> > > 20 are branches. clear_page(), with Zicboz and a 64 byte block size,
> > > takes 169 total instructions, where 4 are branches and 33 are nops.
> > > Even though the block size is a variable, thanks to alternatives, we
> > > can still implement a Duff device without having to do any preliminary
> > > calculations. This is achieved by taking advantage of 'vendor_id'
> > > being used as application-specific data for alternatives, enabling us
> > > to stop patching / unrolling when 4K bytes have been zeroed (we would
> > > loop and continue after 4K if the page size would be larger)
> > >
> > > For 4K pages, unrolling 16 times allows block sizes of 64 and 128 to
> > > only loop a few times and larger block sizes to not loop at all. Since
> > > cbo.zero doesn't take an offset, we also need an 'add' after each
> > > instruction, making the loop body 112 to 160 bytes. Hopefully this
> > > is small enough to not cause icache misses.
> > >
> > > Signed-off-by: Andrew Jones <ajones@xxxxxxxxxxxxxxxx>
> > > Acked-by: Conor Dooley <conor.dooley@xxxxxxxxxxxxx>
> >
> > > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > > index 74736b4f0624..42246bbfa532 100644
> > > --- a/arch/riscv/kernel/cpufeature.c
> > > +++ b/arch/riscv/kernel/cpufeature.c
> > > @@ -280,6 +280,17 @@ void __init riscv_fill_hwcap(void)
> > >  #ifdef CONFIG_RISCV_ALTERNATIVE
> > >  static bool riscv_cpufeature_application_check(u32 feature, u16 data)
> > >  {
> > > +	switch (feature) {
> > > +	case RISCV_ISA_EXT_ZICBOZ:
> > > +		/*
> > > +		 * Zicboz alternative applications provide the maximum
> >
> > I like the comment, rather than this being some wizardry.
> > I find the word "applications" to be a little unclear, perhaps, iff this
> > series needs a respin, this would work better as "Users of the Zicboz
> > alternative provide..." (or s/Users/Callers)?
>
> Right, "applications" is an overloaded word. "users" is probably a better
> choice. "callers" isn't quite right, to me, since it's a code patching
> "application" / "use". Do you think the function name should change as
> well?

I was initially going to suggest that too, but then couldn't really
think of something better. s/application_check/check_applies/ maybe?

> > > +		 * supported block size order, or zero when it doesn't
> > > +		 * matter. If the current block size exceeds the maximum,
> > > +		 * then the alternative cannot be applied.
> > > +		 */
> > > +		return data == 0 || riscv_cboz_block_size <= (1U << data);
> > > +	}
> > > +
> > >  	return data == 0;
> > >  }
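
For anyone following along, here is a rough user-space sketch of what the
quoted check and the unrolling arithmetic amount to. It is not the kernel
code: zicboz_check_applies() and unrolled_loop_passes() are hypothetical
names, the block size is passed as a parameter instead of being read from
riscv_cboz_block_size, and the 4K page size and unroll factor of 16 are
taken from the commit message above. It also assumes 'data' stays below 32
so that the shift is well defined.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* 4K page, as in the commit message */
#define SKETCH_UNROLL    16u   /* unroll factor, as in the commit message */

/*
 * Hypothetical stand-in for the Zicboz case of the quoted check.
 * 'data' is the maximum supported block size order (log2 of the size
 * in bytes), or zero when the block size doesn't matter.
 * Assumes data < 32.
 */
static bool zicboz_check_applies(unsigned int block_size, uint16_t data)
{
	return data == 0 || block_size <= (1U << data);
}

/*
 * Number of passes through a 16x unrolled body needed to zero one 4K
 * page. A result of 1 means the body runs once and never loops back.
 * This is arithmetic derived from the description, not kernel code.
 */
static unsigned int unrolled_loop_passes(unsigned int block_size)
{
	unsigned int bytes_per_pass = SKETCH_UNROLL * block_size;

	if (bytes_per_pass >= SKETCH_PAGE_SIZE)
		return 1;
	return SKETCH_PAGE_SIZE / bytes_per_pass;
}
```

With a 64 byte block size, zicboz_check_applies(64, 6) is true and
zicboz_check_applies(64, 5) is false, which is how patching can stop for
entries that would go past the page. unrolled_loop_passes() gives 4 for
64 byte blocks, 2 for 128 and 1 for 256 and up, which matches the "only
loop a few times" / "not loop at all" wording.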