On Mon, May 15, 2023 at 06:43:40PM +0200, Thomas Gleixner wrote:
> Folks!
>
> We're observing massive latencies and slowdowns on ARM32 machines due to
> excessive TLB flush ranges.
>
> Those can be observed when tearing down a process, which has a seccomp
> BPF filter installed. ARM32 uses the vmalloc area for module space.
>
> bpf_prog_free_deferred()
>   vfree()
>     _vm_unmap_aliases()
>       collect_per_cpu_vmap_blocks: start:0x95c8d000 end:0x95c8e000 size:0x1000
>       __purge_vmap_area_lazy(start:0x95c8d000, end:0x95c8e000)
>
>         va_start:0xf08a1000 va_end:0xf08a5000 size:0x00004000 gap:0x5ac13000 (371731 pages)
>         va_start:0xf08a5000 va_end:0xf08a9000 size:0x00004000 gap:0x00000000 (     0 pages)
>         va_start:0xf08a9000 va_end:0xf08ad000 size:0x00004000 gap:0x00000000 (     0 pages)
>         va_start:0xf08ad000 va_end:0xf08b1000 size:0x00004000 gap:0x00000000 (     0 pages)
>         va_start:0xf08b3000 va_end:0xf08b7000 size:0x00004000 gap:0x00002000 (     2 pages)
>         va_start:0xf08b7000 va_end:0xf08bb000 size:0x00004000 gap:0x00000000 (     0 pages)
>         va_start:0xf08bb000 va_end:0xf08bf000 size:0x00004000 gap:0x00000000 (     0 pages)
>         va_start:0xf0a15000 va_end:0xf0a17000 size:0x00002000 gap:0x00156000 (   342 pages)
>
>       flush_tlb_kernel_range(start:0x95c8d000, end:0xf0a17000)
>
>         Does 372106 flush operations where only 31 are useful

So, you asked the architecture to flush a large range, and are then
surprised if it takes a long time. There is no way to know how many
of those are useful.

Now, while using the sledge hammer of flushing all TLB entries may
sound like a good answer, if we're only evicting 31 entries, the
other entries are probably useful to have, no?

I think that you'd only run into this if you had a huge BPF
program and you tore it down, no?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!