Re: [PATCH v3 21/26] arm64: Introduce asm/vdso/arch_timer.h

Hi Vincenzo,

On Fri, Mar 13, 2020 at 03:43:40PM +0000, Vincenzo Frascino wrote:
> The vDSO library should only include the necessary headers required for
> a userspace library (UAPI and a minimal set of kernel headers). To make
> this possible it is necessary to isolate from the kernel headers the
> common parts that are strictly necessary to build the library.
> 
> Introduce asm/vdso/arch_timer.h to contain all the arm64 specific
> code. This allows replacing the second isb() in __arch_get_hw_counter()
> with a fake dependent stack read of the counter, which improves vdso
> library performance by ~4.5%. Below are the results of vdsotest [1] run
> for 100 iterations.
> 
> Before the patch:
> =================
> clock-gettime-monotonic: syscall: 771 nsec/call
> clock-gettime-monotonic:    libc: 130 nsec/call
> clock-gettime-monotonic:    vdso: 111 nsec/call
> ...
> clock-gettime-realtime: syscall: 762 nsec/call
> clock-gettime-realtime:    libc: 130 nsec/call
> clock-gettime-realtime:    vdso: 111 nsec/call
> 
> After the patch:
> ================
> clock-gettime-monotonic: syscall: 792 nsec/call
> clock-gettime-monotonic:    libc: 124 nsec/call
> clock-gettime-monotonic:    vdso: 106 nsec/call
> ...
> clock-gettime-realtime: syscall: 776 nsec/call
> clock-gettime-realtime:    libc: 124 nsec/call
> clock-gettime-realtime:    vdso: 106 nsec/call
> 
> [1] https://github.com/nathanlynch/vdsotest
> 
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Cc: Marc Zyngier <maz@xxxxxxxxxx>
> Cc: Mark Rutland <Mark.Rutland@xxxxxxx>
> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@xxxxxxx>
> ---
>  arch/arm64/include/asm/arch_timer.h        | 29 ++++---------------
>  arch/arm64/include/asm/vdso/arch_timer.h   | 33 ++++++++++++++++++++++
>  arch/arm64/include/asm/vdso/gettimeofday.h |  7 +++--
>  3 files changed, 42 insertions(+), 27 deletions(-)
>  create mode 100644 arch/arm64/include/asm/vdso/arch_timer.h
> 
> diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h
> index 7ae54d7d333a..7f22cd00ad45 100644
> --- a/arch/arm64/include/asm/arch_timer.h
> +++ b/arch/arm64/include/asm/arch_timer.h
> @@ -164,24 +164,7 @@ static inline void arch_timer_set_cntkctl(u32 cntkctl)
>  	isb();
>  }
>  
> -/*
> - * Ensure that reads of the counter are treated the same as memory reads
> - * for the purposes of ordering by subsequent memory barriers.
> - *
> - * This insanity brought to you by speculative system register reads,
> - * out-of-order memory accesses, sequence locks and Thomas Gleixner.
> - *
> - * http://lists.infradead.org/pipermail/linux-arm-kernel/2019-February/631195.html
> - */
> -#define arch_counter_enforce_ordering(val) do {				\
> -	u64 tmp, _val = (val);						\
> -									\
> -	asm volatile(							\
> -	"	eor	%0, %1, %1\n"					\
> -	"	add	%0, sp, %0\n"					\
> -	"	ldr	xzr, [%0]"					\
> -	: "=r" (tmp) : "r" (_val));					\
> -} while (0)
> +#include <asm/vdso/arch_timer.h>
>  
>  static __always_inline u64 __arch_counter_get_cntpct_stable(void)
>  {
> @@ -189,7 +172,7 @@ static __always_inline u64 __arch_counter_get_cntpct_stable(void)
>  
>  	isb();
>  	cnt = arch_timer_reg_read_stable(cntpct_el0);
> -	arch_counter_enforce_ordering(cnt);
> +	cnt = arch_counter_enforce_ordering(cnt);
>  	return cnt;

Why have you changed the structure of arch_counter_enforce_ordering() to
return a value? The commit message has no rationale for that.

If there is a reason to change that, I'd prefer the driver change as a
separate patch, ahead of the one that moves the definition.

[...]

> +/*
> + * Ensure that reads of the counter are treated the same as memory reads
> + * for the purposes of ordering by subsequent memory barriers.
> + *
> + * This insanity brought to you by speculative system register reads,
> + * out-of-order memory accesses, sequence locks and Thomas Gleixner.
> + *
> + * http://lists.infradead.org/pipermail/linux-arm-kernel/2019-February/631195.html
> + *
> + */
> +static u64 arch_counter_enforce_ordering(u64 val)
> +{
> +	u64 tmp, _val = (val);
> +
> +	asm volatile(
> +	"	eor	%0, %1, %1\n"
> +	"	add	%0, sp, %0\n"
> +	"	ldr	xzr, [%0]"
> +	: "=r" (tmp) : "r" (_val));
> +
> +	return _val;
> +}

This change has no functional effect. Since `_val` is only passed in as
an input operand, the compiler can assume the assembly does not modify
it, so the function returns exactly the value it was given.

As above, what is the rationale for changing this?
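
To illustrate the point about operand constraints, here is a minimal,
architecture-neutral sketch. The function names are hypothetical, and the
empty asm template merely stands in for the real eor/add/ldr sequence; it
assumes a GCC- or Clang-compatible compiler. It is not a suggestion for
what the patch should do, only a way to show the difference:

```c
#include <stdint.h>

/*
 * Hypothetical helper: val is an input-only operand ("r").
 * The compiler may assume the asm leaves val untouched, so the
 * return value is simply the argument, independent of the asm.
 */
static inline uint64_t order_input_only(uint64_t val)
{
	__asm__ volatile("" : : "r" (val));
	return val;
}

/*
 * Hypothetical helper: val is a read-write operand ("+r").
 * The compiler must assume the asm may have modified val, so the
 * return value is treated as an output of the asm statement.
 */
static inline uint64_t order_read_write(uint64_t val)
{
	__asm__ volatile("" : "+r" (val));
	return val;
}
```

Both helpers return their argument unchanged at run time. The difference
is only in what the compiler is allowed to assume: in the first form,
`cnt = order_input_only(cnt)` tells it nothing more than a plain call
with the result ignored.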

> @@ -82,10 +83,10 @@ static __always_inline u64 __arch_get_hw_counter(s32 clock_mode)
>  	isb();
>  	asm volatile("mrs %0, cntvct_el0" : "=r" (res) :: "memory");
>  	/*
> -	 * This isb() is required to prevent that the seq lock is
> -	 * speculated.#
> +	 * arch_counter_enforce_ordering() is required to prevent that
> +	 * the seq lock is speculated.
>  	 */
> -	isb();
> +	res = arch_counter_enforce_ordering(res);

Can we delete the comment entirely? We don't bother in <asm/arch_timer.h>.

Even better, can we factor out __arch_counter_get_cntvct(), and use
that?

Thanks,
Mark.


