On Thu, May 27, 2021 at 09:30:00AM +0200, Michal Hocko wrote: > On Wed 26-05-21 13:01:39, Feng Tang wrote: > > mempolicy_nodemask_intersects() is used in oom case to check if a > > task may have memory allocated on some memory nodes. > > > > Currently, the nodes_intersects() is run for both 'bind' and 'interleave' > > policies. But they are different regarding memory allocation, the nodemask > > is a forced requirement for 'bind', while just a hint for 'interleave'. > > Like in alloc_pages_vma(): > > > > nmask = policy_nodemask(gfp, pol); > > preferred_nid = policy_node(gfp, pol, node); > > page = __alloc_pages(gfp, order, preferred_nid, nmask); > > > > in plicy_nodemask(), only 'bind' policy may return its desired nodemask, > > while others return NULL. And this 'NULL' enables the 'interleave' policy > > can get memory from other nodes than its nodemask. > > > > So skip the nodemask intersect check for 'interleave' policy. > > The changelog is not really clear on the actual effect of the > patch and the above reference to alloc_pages_vma looks misleading to me > because that path is never called for interleaved policy. You are right. thanks for pointing it out. Only the 'bind' policy calls policy_nodemask() and gets its preset nodemask, while for 'interleave', alloc_page_interleave() calls __alloc_pages() with NULL nodemask, so the conclusion is the same that 'bind' policy can only get memory from its preset nodemask, while 'interleave' can get memory from all nodes. > This is very likely my fault because I was rather vague. The existing > code in its current form is confusing but it _works_ properly. The > problem is that it sounds like a general helper and in that regards > the function is correct for the interleaved policy and your proposed > preferred-many. But its only existing caller wants a different semantic. > > Until now this was not a real problem even for OOM context because > alloc_page_interleave is always used for the interleaving policy > and that one doesn't use any node mask so the code is not really > exercised. With your MPOL_PREFERRED this would no longer be the case. Given the 'interleave' task may have memory allocated from all nodes, shouldn't the mempolicy_nodemask_intersects() return true for 'interleave'? or I'm still missing something? > Your patch makes the code more robust for the oom context but it can > confuse other users who really want to do an intersect logic. So I think > it would really be best to rename the function and make it oom specific. > E.g. mempolicy_in_oom_domain(tsk, mask) this would make it clear that > this is not a general purpose function. Ok, will rename like this. Thanks, Feng > The changelog should be clear that this is just a code cleanup rather > than fix. > > -- > Michal Hocko > SUSE Labs