On 05/16/23 at 08:40am, Thomas Gleixner wrote:
> On Tue, May 16 2023 at 10:26, Baoquan He wrote:
> > On 05/15/23 at 08:17pm, Uladzislau Rezki wrote:
> >> For systems which lack a full TLB flush and to flush a long range is
> >> a problem(it takes time), probably we can flush VA one by one. Because
> >> currently we calculate a flush range [min:max] and that range includes
> >> the space that might not be mapped at all. Like below:
> >
> > It's fine if we only calculate a flush range of [min:max] with VA. In
> > vm_reset_perms(), it calculates the flush range with the impacted direct
> > mapping range, then merge it with VA's range. That looks really strange
> > and surprising. If the vm->pages[] are got from a lower part of physical
> > memory, the final merged flush will span tremendous range. Wondering why
> > we need merge the direct map range with VA range, then do flush. Not
> > sure if I misunderstand it.
>
> So what happens on this BPF teardown is:
>
> The vfree(8k) ends up flushing 3 entries. The actual vmalloc part (2) and
> one extra which is in the direct map. I haven't verified that yet, but I
> assume it's the alias of one of the vmalloc'ed pages.

That looks like the reason. As Uladzislau pointed out, some architectures
may have a full TLB flush, so they won't get into trouble from a merged
flush over the calculated [min:max] range, e.g. flush_tlb_kernel_range()
on arm64 and x86. However, arm32 seems to lack the ability to do a full
TLB flush.

If agreed, I can draft a patch that flushes the direct map and VA ranges
separately, and see if it works.
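
To make the size difference concrete, here is a rough userspace model of
the arithmetic. It is only a sketch, not kernel code. The two addresses,
the 4 KiB page size and all helper names are made up for illustration, and
it assumes an architecture that has to walk the flush range page by page.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages

# Hypothetical half-open [start, end) ranges, chosen only for illustration:
# one aliased page in the direct map and the 8k vmalloc area itself.
RANGES = [
    (0xC0100000, 0xC0101000),  # direct map alias, 1 page
    (0xF0800000, 0xF0802000),  # vmalloc area, 2 pages
]


def pages_in(start, end):
    """Number of pages covered by the half-open range [start, end)."""
    return (end - start + PAGE_SIZE - 1) // PAGE_SIZE


def merged_flush_pages(ranges):
    """Pages walked by a single flush over [min(start), max(end))."""
    lo = min(start for start, _ in ranges)
    hi = max(end for _, end in ranges)
    return pages_in(lo, hi)


def separate_flush_pages(ranges):
    """Pages walked when each (non-overlapping) range is flushed alone."""
    return sum(pages_in(start, end) for start, end in ranges)


if __name__ == "__main__":
    print("merged  :", merged_flush_pages(RANGES))
    print("separate:", separate_flush_pages(RANGES))
```

With these made-up addresses the separate flushes touch 3 pages, matching
the 3 entries Thomas saw, while the merged [min:max] flush would walk
198402 pages, nearly all of which have nothing to do with the vfree().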