On Mon, May 15 2023 at 22:31, Russell King wrote:
> On Mon, May 15, 2023 at 11:11:45PM +0200, Thomas Gleixner wrote:
>> But that's not necessarily true for ARM32, as there are no IPIs involved
>> on the machine we are using, which is a dual-core Cortex-A9.
>>
>> So I came up with the hack below, which is equally fast as the full
>> flush variant, while the performance impact on the other CPUs is
>> slightly lower according to perf.
>>
>> That should probably take another argument telling how many TLB ranges
>> this flush affects, i.e. 3 in this example, so an architecture can
>> sensibly decide whether it wants to use a full flush or not.
>>
>> @@ -1747,7 +1748,12 @@ static bool __purge_vmap_area_lazy(unsig
>>  			list_last_entry(&local_purge_list,
>>  					struct vmap_area, list)->va_end);
>>
>> -	flush_tlb_kernel_range(start, end);
>> +	if (tmp.va_end > tmp.va_start)
>> +		list_add(&tmp.list, &local_purge_list);
>> +	flush_tlb_kernel_vas(&local_purge_list);
>> +	if (tmp.va_end > tmp.va_start)
>> +		list_del(&tmp.list);
>
> So basically we end up iterating over each VA range, which seems
> sensible if the range is large and we have to iterate over it page
> by page.

Right.

> In the case you have, are "start" and "end" set on function entry
> to a range, or are they set to ULONG_MAX,0 ? What I'm wondering is
> whether we could get away with just having flush_tlb_kernel_vas().
>
> Whether that's acceptable to others is a different question :)

As I said, flush_tlb_kernel_vas() should be

  void flush_tlb_kernel_vas(struct list_head *list, unsigned int num_entries);

so that an architecture can decide whether it is worth walking the
entries or whether it resorts to a full flush.
>> +static void do_flush_vas(void *arg)
>> +{
>> +	struct list_head *list = arg;
>> +	struct vmap_area *va;
>> +	unsigned long addr;
>> +
>> +	list_for_each_entry(va, list, list) {
>> +		/* flush the range one page at a time via 'invlpg' */
>> +		for (addr = va->va_start; addr < va->va_end; addr += PAGE_SIZE)
>> +			flush_tlb_one_kernel(addr);
>
> Isn't this just the same as:
>
> 	flush_tlb_kernel_range(va->va_start, va->va_end);

Indeed.

Thanks,

        tglx