On Thu, 18 May 2017, Vlastimil Babka wrote:

> > The race is where? If you expand the node set during the move of the
> > application then you are safe in terms of the legacy apps that did not
> > include static bindings.
>
> No, that expand/shrink by itself doesn't work against parallel

Parallel? I think we are clear that this is inherently racy against the
app changing policies etc etc? There is a huge issue there already. The
app needs to be well behaved in some heretofore undefined way in order to
make moves clean.

> get_page_from_freelist going through a zonelist. Moving from node 0 to
> 1, with zonelist containing nodes 1 and 0 in that order:
>
> - mempolicy mask is 0
> - zonelist iteration checks node 1, it's not allowed, skip

There is an allocation from node 1? This is not allowed before the move.
So it should fail, not skip to another node.

> - mempolicy mask is 0,1 (expand)
> - mempolicy mask is 1 (shrink)
> - zonelist iteration checks node 0, it's not allowed, skip
> - OOM

Are you talking about a race here between zonelist scanning and the
moving? That has been there forever.

And frankly there are gazillions of these races. The best thing to do is
to get the cpuset moving logic out of the kernel and into user space.

Understand that this is a heuristic and maybe come up with a list of
restrictions that make an app safe. A safe app that can be moved must,
e.g.:

1. Not allocate new memory while it is being moved.
2. Not change memory policies after its initialization and while it is
   being moved.
3. Not save memory policy state in some variable (because the logic to
   translate the memory policies for the new context cannot find it).

...

Again, cpuset process migration is a huge mess that you do not want to
have in the kernel, and AFAICT this is a corner case with difficult
semantics.

Better have that in user space...
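To make the quoted interleaving concrete, here is a rough toy model of it in Python. This is only a sketch and not kernel code: scan_zonelist(), racy_move() and quiescent_move() are made-up names, the mempolicy mask is modelled as a plain set of node ids, and the "mover" hook stands in for the cpuset update running concurrently with the zonelist walk.

```python
# Toy model of the interleaving quoted above; NOT kernel code.
# All names (scan_zonelist, racy_move, quiescent_move) are hypothetical.

def scan_zonelist(zonelist, get_mask, between_checks=None):
    """Walk the zonelist once, re-reading the allowed mask at every node.

    Returns the first allowed node, or None (the "OOM" outcome).
    between_checks(i) runs after node i was rejected, to model a
    concurrent cpuset move changing the mask in the middle of the scan.
    """
    for i, node in enumerate(zonelist):
        if node in get_mask():
            return node
        if between_checks is not None:
            between_checks(i)
    return None


def racy_move():
    """Move from node 0 to node 1 while a scan of [1, 0] is in flight."""
    mask = {0}                # task starts bound to node 0
    zonelist = [1, 0]         # node 1 is checked first, then node 0

    def mover(i):
        if i == 0:            # node 1 has just been rejected
            mask.add(1)       # expand: mask is now {0, 1}
            mask.discard(0)   # shrink: mask is now {1}

    return scan_zonelist(zonelist, lambda: mask, mover)


def quiescent_move():
    """Same move, but the task does not allocate until it has finished."""
    mask = {0}
    zonelist = [1, 0]
    mask.add(1)               # expand
    mask.discard(0)           # shrink
    return scan_zonelist(zonelist, lambda: mask)
```

In this model racy_move() finds no allowed node at all, while quiescent_move() lands on node 1. That is the point of restriction 1 above: if the app does not allocate while it is being moved, the scan never sees a half-updated mask.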