On 30/08/17 09:20, Leo Yan wrote: > Hi Marc, > > On Wed, Aug 30, 2017 at 09:02:47AM +0100, Marc Zyngier wrote: >> + Mark >> >> On 30/08/17 06:06, Leo Yan wrote: >>> commit fa8d815fac96e7c9247783d5a1f8fa4685b3c543 upstream. >>> >>> Cortex-A73 (all versions) counter read can return a wrong value >>> when the counter crosses a 32bit boundary. >>> >>> The workaround involves performing the read twice, and to return >>> one or the other depending on whether a transition has taken place. >>> >>> Signed-off-by: Marc Zyngier <marc.zyngier@xxxxxxx> >>> Signed-off-by: Leo Yan <leo.yan@xxxxxxxxxx> >>> --- >>> arch/arm64/Kconfig | 12 ++++++++++++ >>> arch/arm64/include/asm/arch_timer.h | 11 +++++++++++ >>> 2 files changed, 23 insertions(+) >>> >>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig >>> index 14cdc6d..68e7c98 100644 >>> --- a/arch/arm64/Kconfig >>> +++ b/arch/arm64/Kconfig >>> @@ -374,6 +374,18 @@ config ARM64_ERRATUM_843419 >>> >>> If unsure, say Y. >>> >>> +config ARM64_ERRATUM_858921 >>> + bool "Cortex-A73: 858921: arch timer counter read can return a wrong value" >>> + default y >>> + depends on ARM_ARCH_TIMER && ARM64 >>> + help >>> + This option enables a workaround applicable to Cortex-A73 >>> + (all versions), whose counter may return incorrect values. >>> + The workaround will be dynamically enabled when an affected >>> + core is detected. >>> + >>> + If unsure, say Y. >>> + >>> config CAVIUM_ERRATUM_22375 >>> bool "Cavium erratum 22375, 24313" >>> default y >>> diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h >>> index fbe0ca3..9b2b0f5 100644 >>> --- a/arch/arm64/include/asm/arch_timer.h >>> +++ b/arch/arm64/include/asm/arch_timer.h >>> @@ -114,6 +114,16 @@ static inline u64 arch_counter_get_cntpct(void) >>> return 0; >>> } >>> >>> +#ifdef CONFIG_ARM64_ERRATUM_858921 >>> +static inline u64 arch_counter_get_cntvct(void) >>> +{ >>> + u64 old, new; >>> + >>> + asm volatile("mrs %0, cntvct_el0" : "=r" (old)); >>> + asm volatile("mrs %0, cntvct_el0" : "=r" (new)); >>> + return (((old ^ new) >> 32) & 1) ? old : new; >>> +} >>> +#else >>> static inline u64 arch_counter_get_cntvct(void) >>> { >>> u64 cval; >>> @@ -123,6 +133,7 @@ static inline u64 arch_counter_get_cntvct(void) >>> >>> return cval; >>> } >>> +#endif >>> >>> static inline int arch_timer_arch_init(void) >>> { >>> >> >> This is completely busted: >> >> - You are only addressing the kernel side, and ignore userspace (which >> is just as broken as the kernel). > > Thanks for quick reviewing. > > Yeah, I remembered there have some code directly reads arch timer > virtual counter from userspace. But can you remind for userspace > broken issue, is the code in libc or vdso? You're missing the point. The virtual counter is freely available to userspace to play with (this is a de-facto ABI). Since it can return bad values, it needs to be trapped to be correctly emulated (together with cntfrq_el0), and the VDSO disabled. > Here I have another question is: after applied the whole workaround > infrastructure, can it also fix userspace broken issue? Yup. That's why I ended-up with a 18 patches series, and not just this single one. Trust me, I'm lazy. There is nothing I hate more than doing useless work. >> - You lie in the description of the option (this is in no way dynamic, >> since you didn't backport the whole workaround infrastructure). > > I should fix it. > >> So I'm afraid I'm NAKing this. Please refrain from blindly backporting >> random patches. fa8d815fac96e7c9 only makes sense in the context of the >> whole series, and on its own gives you a very false sense of having >> properly addressed it. > > IIUC, at least fa8d815fac96e7c9 can fix issue in kernel side, such like > for sched_clock() roll back issue [1]. What's the point of fixing the kernel if userspace is just as likely to fail? > So for this issue, are you suggesting we need backport whole workaround > infrastructure onto kernel 4.4 and 4.9? Many ARM devices are working > with these two kernels. Then these systems are completely broken if they use a Cortex-A73. Either they run mainline (which will be just fine), or they get a fully backported workaround infrastructure. Thanks, M. -- Jazz is not dead. It just smells funny...