On Thu, Aug 10, 2023 at 09:58:43PM +0800, Chao Gao wrote:
> On Thu, Aug 10, 2023 at 04:56:36PM +0800, Yan Zhao wrote:
> >This is an RFC series trying to fix the issue of unnecessary NUMA
> >protection and TLB-shootdowns found in VMs with assigned devices or VFIO
> >mediated devices during NUMA balance.
> >
> >For VMs with assigned devices or VFIO mediated devices, all or part of
> >guest memory are pinned for long-term.
> >
> >Auto NUMA balancing will periodically selects VMAs of a process and change
> >protections to PROT_NONE even though some or all pages in the selected
> >ranges are long-term pinned for DMAs, which is true for VMs with assigned
> >devices or VFIO mediated devices.
> >
> >Though this will not cause real problem because NUMA migration will
> >ultimately reject migration of those kind of pages and restore those
> >PROT_NONE PTEs, it causes KVM's secondary MMU to be zapped periodically
> >with equal SPTEs finally faulted back, wasting CPU cycles and generating
> >unnecessary TLB-shootdowns.
>
> In my understanding, NUMA balancing also moves tasks closer to the memory
> they are accessing. Can this still work with this series applied?
>
For pages set to PROT_NONE in the primary MMU during the scanning phase, yes.
For pages not set to PROT_NONE, no, because it looks like
task_numa_migrate() is only triggered on the next page fault, when a
PROT_NONE PTE in an accessible VMA is found.