On Wed 07-09-22 21:50:24, Zhongkun He wrote:
[...]
> > Do you really need to change the policy itself or only the effective
> > nodemask? Do you need any other policy than bind and preferred?
>
> Yes, we need to change the policy, not only his nodemask. we really want
> policy is interleave, and extend it to weight-interleave.
> Say something like the following
>                         node    weight
>     interleave:         0-3     1:1:1:1   default one by one
>     weight-interleave:  0-3     1:2:4:6   alloc pages by weight
>                                           (User set weight.)
> In the actual usecase, the remaining resources of each node are different,
> and the use of interleave cannot maximize the use of resources.

OK, this seems a separate topic. It would be good to start by proposing
that new policy in isolation with the semantic description.

> Back to the previous question.
> > The question is how to implement that with a sensible semantic.
>
> Thanks for your analysis and suggestions. It is really difficult to add
> policy directly to cgroup for the hierarchical enforcement. It would be a
> good idea to add pidfd_set_mempolicy.

Are you going to pursue that path?

> Also, there is a new idea.
> We can try to separate the elements of mempolicy and use them independently.
> Mempolicy has two meanings:
>     nodes: which nodes to use (nodes, 0-3), we can use cpuset's
>            effective_mems directly.
>     mode:  how to use them (bind, prefer, etc). change the mode to a
>            cpuset->flags, such as CS_INTERLEAVE.
> task_struct->mems_allowed is equal to cpuset->effective_mems, which is
> hierarchical enforcement. CS_INTERLEAVE can also be updated into tasks,
> just like other flags (CS_SPREAD_PAGE).
> When a process needs to allocate memory, it can find the appropriate node to
> allocate pages according to the flag and mems_allowed.

I am not sure I see the advantage as the mode and nodes are always
closely coupled. You cannot really have one without the other.

-- 
Michal Hocko
SUSE Labs
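
For reference, the weight-interleave semantic quoted above (nodes 0-3 with
weights 1:2:4:6) could be sketched in userspace C roughly as follows. This is
only an illustration of one possible reading of "alloc pages by weight": each
node receives its weight's worth of consecutive pages before the cursor moves
on. The struct and the wi_init()/wi_next_node() helpers are hypothetical names
made up for this sketch; they are not kernel interfaces and not the proposed
implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state for a weighted round-robin over NUMA nodes. */
struct weighted_interleave {
	const unsigned int *weights;	/* weights[i]: pages for node i per round */
	size_t nr_nodes;		/* number of entries in weights[] */
	size_t cur;			/* node currently being filled */
	unsigned int used;		/* pages already placed on cur this round */
};

/* Set up the cursor. At least one weight must be non-zero. */
static void wi_init(struct weighted_interleave *wi,
		    const unsigned int *weights, size_t nr_nodes)
{
	unsigned long total = 0;
	size_t i;

	assert(weights != NULL && nr_nodes > 0);
	for (i = 0; i < nr_nodes; i++)
		total += weights[i];
	/* An all-zero weight vector would leave no node to allocate from. */
	assert(total > 0);

	wi->weights = weights;
	wi->nr_nodes = nr_nodes;
	wi->cur = 0;
	wi->used = 0;
}

/* Return the node that should receive the next page. */
static size_t wi_next_node(struct weighted_interleave *wi)
{
	/*
	 * Advance once the current node has received its share for this
	 * round. Nodes with weight 0 are skipped by the same test.
	 */
	while (wi->used >= wi->weights[wi->cur]) {
		wi->cur = (wi->cur + 1) % wi->nr_nodes;
		wi->used = 0;
	}
	wi->used++;
	return wi->cur;
}
```

With all weights set to 1 this degenerates to the plain "one by one"
interleave. With 1:2:4:6, every run of 13 allocations places 1, 2, 4 and 6
pages on nodes 0 to 3 respectively. Whether pages for one node should be
handed out back to back like this, or spread more evenly within a round, is
exactly the kind of detail the semantic description would need to pin down.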