On Wed, Jul 03, 2019 at 11:32:32AM +0800, 王贇 wrote:
> By tracing numa page faults, we recognize tasks sharing the same page,
> and try pack them together into a single numa group.
>
> However when two task share lot's of cache pages while not much
> anonymous pages, since numa balancing do not tracing cache page, they
> have no chance to join into the same group.
>
> While tracing cache page cost too much, we could use some hints from

I forgot; where again do we skip shared pages? task_numa_work() doesn't
seem to skip file vmas.

> userland and cpu cgroup could be a good one.
>
> This patch introduced new entry 'numa_group' for cpu cgroup, by echo
> non-zero into the entry, we can now force all the tasks of this cgroup
> to join the same numa group serving for task group.
>
> In this way tasks are more likely to settle down on the same node, to
> share closer cpu cache and gain benefit from NUMA on both file/anonymous
> pages.
>
> Besides, when multiple cgroup enabled numa group, they will be able to
> exchange task location by utilizing numa migration, in this way they
> could achieve single node settle down without breaking load balance.

I dislike cgroup only interfaces; is there really nothing else we could
use for this?