On 05/14/2018 10:28 AM, Boaz Harrosh wrote:
> The VM_LOCAL_CPU flag tells the Kernel that the vma will be used
> from a single-core only, and therefore invalidation (flush_tlb) of
> PTE(s) need not be a wide CPU scheduling.

This doesn't work on x86.  We load TLB entries for lots of reasons, even
if the PTE is never "used".  Is there another architecture you had in
mind that has more predictable TLB population behavior?