On 20-11-06 15:28:59, Huang, Ying wrote:
> Mel Gorman <mgorman@xxxxxxx> writes:
>
> > On Wed, Nov 04, 2020 at 01:36:58PM +0800, Huang, Ying wrote:
> >> But from another point of view, I suggest removing the constraints of
> >> MPOL_F_MOF in the future.  If the overhead of AutoNUMA isn't acceptable,
> >> why not just disable AutoNUMA globally via the sysctl knob?
> >>
> >
> > Because it's a double edged sword. NUMA Balancing can make a workload
> > faster while still incurring more overhead than it should -- particularly
> > when threads are involved rescanning the same or unrelated regions.
> > Global disabling only really should happen when an application is running
> > that is the only application on the machine and has full NUMA awareness.
>
> Got it.  So NUMA Balancing may in general benefit some workloads but
> hurt some other workloads on one machine.  So we need a method to
> enable/disable NUMA Balancing for one workload.  Previously, this was
> done via the explicit NUMA policy.  If some explicit NUMA policy is
> specified, NUMA Balancing is disabled for the memory region or the
> thread.  And this can be reverted again for a memory region via
> MPOL_MF_LAZY.  It appears that we lack MPOL_MF_LAZY for the thread yet.
>
> >> > It might still end up being better but I was not aware of a
> >> > *realistic* workload that binds to multiple nodes
> >> > deliberately. Generally I expect if an application is binding, it's
> >> > binding to one local node.
> >>
> >> Yes.  It's not a popular configuration for now.  But for the memory
> >> tiering system with both DRAM and PMEM, the DRAM and PMEM in one socket
> >> will become 2 NUMA nodes.  To avoid too much cross-socket memory
> >> accessing, but take advantage of both the DRAM and PMEM, the workload
> >> can be bound to 2 NUMA nodes (DRAM and PMEM).
> >>
> >
> > Ok, that may lead to unpredictable performance as it'll have variable
> > performance with limited control of the "important" applications that
> > should use DRAM over PMEM. That's a long road but the step is not
> > incompatible with the long-term goal.
>
> Yes.  Ben Widawsky is working on a patchset to make it possible to
> prefer the remote DRAM instead of the local PMEM as follows,
>
> https://lore.kernel.org/linux-mm/20200630212517.308045-1-ben.widawsky@xxxxxxxxx/
>
> Best Regards,
> Huang, Ying
>

Rebased version was posted here:
https://lore.kernel.org/linux-mm/20201030190238.306764-1-ben.widawsky@xxxxxxxxx/

Thanks.
Ben
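
For readers following along, here is a rough userspace sketch of the two knobs discussed above: checking the global NUMA Balancing sysctl, and building a node mask for a workload bound to two nodes (one DRAM, one PMEM). The helper names `nodemask_from_nodes` and `read_numa_balancing`, and the node ids used in the comments, are made up for illustration and do not come from the patchset. The resulting mask is the kind of value one could pass as the `nodemask` argument of mbind(2) or set_mempolicy(2) with MPOL_BIND; that call is left out so the sketch does not depend on libnuma.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build a one-word node mask from a list of NUMA
 * node ids, e.g. {0, 2} for a DRAM node 0 and a PMEM node 2 (example
 * ids only). Returns 0 if any id does not fit in one unsigned long.
 */
unsigned long nodemask_from_nodes(const int *nodes, size_t count)
{
    const int bits = (int)(sizeof(unsigned long) * 8);
    unsigned long mask = 0;

    for (size_t i = 0; i < count; i++) {
        if (nodes[i] < 0 || nodes[i] >= bits)
            return 0;
        mask |= 1UL << nodes[i];
    }
    return mask;
}

/*
 * Hypothetical helper: read the global NUMA Balancing setting from the
 * given path, normally /proc/sys/kernel/numa_balancing.
 * Returns the integer found there, or -1 if the file cannot be read
 * (for example on a kernel built without NUMA Balancing).
 */
int read_numa_balancing(const char *path)
{
    FILE *f = fopen(path, "r");
    int value = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

A caller would typically check `read_numa_balancing("/proc/sys/kernel/numa_balancing")` first, then pass the mask from `nodemask_from_nodes()` to mbind(2) for the regions it wants restricted to the DRAM and PMEM nodes of one socket.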