On 08/31/2018 05:12 PM, Hauke Mehrtens wrote:
> On 08/30/2018 08:01 PM, Paul Burton wrote:
>> When a system suffers from dcache aliasing a user program may observe
>> stale VDSO data from an aliased cache line. Notably this can break the
>> expectation that clock_gettime(CLOCK_MONOTONIC, ...) is, as its name
>> suggests, monotonic.
>>
>> In order to ensure that users observe updates to the VDSO data page as
>> intended, align the user mappings of the VDSO data page such that
>> their cache colouring matches that of the virtual address range which
>> the kernel will use to update the data page - typically its unmapped
>> address within kseg0.
>>
>> This ensures that we don't introduce aliasing cache lines for the VDSO
>> data page, and therefore that userland will observe updates without
>> requiring cache invalidation.
>>
>> Signed-off-by: Paul Burton <paul.burton@xxxxxxxx>
>> Reported-by: Hauke Mehrtens <hauke@xxxxxxxxxx>
>> Reported-by: Rene Nielsen <rene.nielsen@xxxxxxxxxxxxx>
>> Reported-by: Alexandre Belloni <alexandre.belloni@xxxxxxxxxxx>
>> Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO")
>> Cc: James Hogan <jhogan@xxxxxxxxxx>
>> Cc: linux-mips@xxxxxxxxxxxxxx
>> Cc: stable@xxxxxxxxxxxxxxx # v4.4+
>
> Tested-by: Hauke Mehrtens <hauke@xxxxxxxxxx>

Tested-by: Rene Nielsen <rene.nielsen@xxxxxxxxxxxxx>

> Without this patch ping shows these results on kernel 4.19-rc1 on the
> Lantiq VR9 SoC to a PC directly connected to the LAN port:
>
> root@OpenWrt:~# ping 192.168.1.195
> PING 192.168.1.195 (192.168.1.195): 56 data bytes
> 64 bytes from 192.168.1.195: seq=0 ttl=64 time=0.689 ms
> 64 bytes from 192.168.1.195: seq=1 ttl=64 time=236.527 ms
> 64 bytes from 192.168.1.195: seq=2 ttl=64 time=4294963.829 ms
> 64 bytes from 192.168.1.195: seq=3 ttl=64 time=4294423.824 ms
> 64 bytes from 192.168.1.195: seq=4 ttl=64 time=960.527 ms
> 64 bytes from 192.168.1.195: seq=5 ttl=64 time=472.530 ms
> 64 bytes from 192.168.1.195: seq=6 ttl=64 time=464.530 ms
> 64 bytes from 192.168.1.195: seq=7 ttl=64 time=452.530 ms
>
> With this patch it looks like this:
>
> root@OpenWrt:~# ping 192.168.1.195
> PING 192.168.1.195 (192.168.1.195): 56 data bytes
> 64 bytes from 192.168.1.195: seq=0 ttl=64 time=0.638 ms
> 64 bytes from 192.168.1.195: seq=1 ttl=64 time=0.573 ms
> 64 bytes from 192.168.1.195: seq=2 ttl=64 time=0.605 ms
> 64 bytes from 192.168.1.195: seq=3 ttl=64 time=0.524 ms
> 64 bytes from 192.168.1.195: seq=4 ttl=64 time=0.534 ms
> 64 bytes from 192.168.1.195: seq=5 ttl=64 time=0.518 ms
> 64 bytes from 192.168.1.195: seq=6 ttl=64 time=0.485 ms
> 64 bytes from 192.168.1.195: seq=7 ttl=64 time=0.501 ms
>
>
>> ---
>> Hi Alexandre,
>>
>> Could you try this out on your Ocelot system? Hopefully it'll solve
>> the problem just as well as James' patch but doesn't need the
>> questionable change to arch_get_unmapped_area_common().
>>
>> Thanks,
>>     Paul
>> ---
>>  arch/mips/kernel/vdso.c | 20 ++++++++++++++++++++
>>  1 file changed, 20 insertions(+)
>>
>> diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
>> index 019035d7225c..5fb617a42335 100644
>> --- a/arch/mips/kernel/vdso.c
>> +++ b/arch/mips/kernel/vdso.c
>> @@ -13,6 +13,7 @@
>>  #include <linux/err.h>
>>  #include <linux/init.h>
>>  #include <linux/ioport.h>
>> +#include <linux/kernel.h>
>>  #include <linux/mm.h>
>>  #include <linux/sched.h>
>>  #include <linux/slab.h>
>> @@ -20,6 +21,7 @@
>>
>>  #include <asm/abi.h>
>>  #include <asm/mips-cps.h>
>> +#include <asm/page.h>
>>  #include <asm/vdso.h>
>>
>>  /* Kernel-provided data used by the VDSO. */
>> @@ -128,12 +130,30 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
>>  	vvar_size = gic_size + PAGE_SIZE;
>>  	size = vvar_size + image->size;
>>
>> +	/*
>> +	 * Find a region that's large enough for us to perform the
>> +	 * colour-matching alignment below.
>> +	 */
>> +	if (cpu_has_dc_aliases)
>> +		size += shm_align_mask + 1;
>> +
>>  	base = get_unmapped_area(NULL, 0, size, 0, 0);
>>  	if (IS_ERR_VALUE(base)) {
>>  		ret = base;
>>  		goto out;
>>  	}
>>
>> +	/*
>> +	 * If we suffer from dcache aliasing, ensure that the VDSO data page is
>> +	 * coloured the same as the kernel's mapping of that memory. This
>> +	 * ensures that when the kernel updates the VDSO data userland will see
>> +	 * it without requiring cache invalidations.
>> +	 */
>> +	if (cpu_has_dc_aliases) {
>> +		base = __ALIGN_MASK(base, shm_align_mask);
>> +		base += ((unsigned long)&vdso_data - gic_size) & shm_align_mask;
>> +	}
>> +
>>  	data_addr = base + gic_size;
>>  	vdso_addr = data_addr + PAGE_SIZE;
>>
>>