On Tue, May 16, 2023 at 04:38:58PM +0200, Thomas Gleixner wrote:
> On Tue, May 16 2023 at 15:42, Uladzislau Rezki wrote:
> >> _vm_unmap_aliases() collects dirty ranges from per cpu vmap_block_queue
> >> (what ever that is) and hands a start..end range to
> >> __purge_vmap_area_lazy().
> >>
> >> As I pointed out already, this can also end up being an excessive range
> >> because there is no guarantee that those individual collected ranges are
> >> consecutive. Though I have no idea how to cure that right now.
> >>
> >> AFAICT this was done to spare flush IPIs, but the mm folks should be
> >> able to explain that properly.
> >>
> > This is done to prevent generating IPIs. That is why the whole range is
> > calculated once and a flush occurs only once for all lazily registered VAs.
>
> Sure, but you pretty much enforced flush_tlb_all() by doing that, which
> is not even close to correct.
>
> This range calculation is only correct when the resulting coalesced
> range is consecutive, but if the resulting coalesced range is huge with
> large holes and only a few pages to flush, then it's actively wrong.
>
> The architecture has zero chance to decide whether it wants to flush
> single entries or all in one go.
>
It depends on what is a corner case and what is not. Usually all
allocations are done sequentially. On the other hand, that is not always
true. A good example is module loading/unloading (it has a special place
in vmap space). In this scenario we are quite far in vmap space from, for
example, the VMALLOC_START point. So it will require a flush_tlb_all, yes.

>
> There is a world outside of x86, but even on x86 it's borderline silly
> to take the whole TLB out when you can flush 3 TLB entries one by one
> with exactly the same number of IPIs, i.e. _one_. No?
>
I meant if we invoke flush_tlb_kernel_range() on each VA's individual
range:

<ARM>
void flush_tlb_kernel_range(unsigned long start, unsigned long end)
{
	if (tlb_ops_need_broadcast()) {
		struct tlb_args ta;
		ta.ta_start = start;
		ta.ta_end = end;
		on_each_cpu(ipi_flush_tlb_kernel_range, &ta, 1);
	} else
		local_flush_tlb_kernel_range(start, end);
	broadcast_tlb_a15_erratum();
}
<ARM>

we should IPI and wait, no?

--
Uladzislau Rezki
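
For illustration only, a small userspace sketch (not kernel code) of the
point about holes in the coalesced range discussed above. struct va_range,
coalesced_pages() and dirty_pages() are made-up names, 4 KiB pages are
assumed, and the ranges are assumed not to overlap:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for one lazily freed VA range: [start, end). */
struct va_range {
	unsigned long start;
	unsigned long end;
};

/* Pages covered by one min(start)..max(end) span, holes included. */
unsigned long coalesced_pages(const struct va_range *r, size_t n)
{
	unsigned long start, end;
	size_t i;

	assert(n > 0);
	start = r[0].start;
	end = r[0].end;
	for (i = 1; i < n; i++) {
		if (r[i].start < start)
			start = r[i].start;
		if (r[i].end > end)
			end = r[i].end;
	}
	return (end - start) >> PAGE_SHIFT;
}

/* Pages that actually need flushing: the sum of the individual ranges. */
unsigned long dirty_pages(const struct va_range *r, size_t n)
{
	unsigned long pages = 0;
	size_t i;

	for (i = 0; i < n; i++)
		pages += (r[i].end - r[i].start) >> PAGE_SHIFT;
	return pages;
}
```

With one page at 0x100000 and two pages at 0x40000000, dirty_pages()
gives 3 while coalesced_pages() gives 0x3ff02, which is the kind of gap
between "pages to flush" and "range handed to the flush" being argued
about. How an architecture would act on either number is outside this
sketch.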