On Wed 26-10-22 17:38:06, Aneesh Kumar K V wrote:
> On 10/26/22 4:32 PM, Michal Hocko wrote:
> > On Wed 26-10-22 16:12:25, Aneesh Kumar K V wrote:
> >> On 10/26/22 2:49 PM, Michal Hocko wrote:
> >>> On Wed 26-10-22 16:00:13, Feng Tang wrote:
> >>>> On Wed, Oct 26, 2022 at 03:49:48PM +0800, Aneesh Kumar K V wrote:
> >>>>> On 10/26/22 1:13 PM, Feng Tang wrote:
> >>>>>> In page reclaim path, memory could be demoted from faster memory tier
> >>>>>> to slower memory tier. Currently, there is no check about cpuset's
> >>>>>> memory policy, that even if the target demotion node is not allowed
> >>>>>> by cpuset, the demotion will still happen, which breaks the cpuset
> >>>>>> semantics.
> >>>>>>
> >>>>>> So add cpuset policy check in the demotion path and skip demotion
> >>>>>> if the demotion targets are not allowed by cpuset.
> >>>>>>
> >>>>>
> >>>>> What about the vma policy or the task memory policy? Shouldn't we respect
> >>>>> those memory policy restrictions while demoting the page?
> >>>>
> >>>> Good question! We have some basic patches to consider memory policy
> >>>> in demotion path too, which are still under test, and will be posted
> >>>> soon. And the basic idea is similar to this patch.
> >>>
> >>> For that you need to consult each vma and its owning task(s) and that
> >>> to me sounds like something to be done in folio_check_references.
> >>> Relying on memcg to get a cpuset cgroup is really ugly and not really
> >>> 100% correct. Memory controller might be disabled and then you do not
> >>> have your association anymore.
> >>>
> >>
> >> I was looking at this recently and I am wondering whether we should worry about VM_SHARE
> >> vmas.
> >>
> >> ie, page_to_policy() can just reverse lookup just one VMA and fetch the policy right?
> >
> > How would that help for private mappings shared between parent/child?
>
>
> this is MAP_PRIVATE | MAP_SHARED? This is not a valid combination IIRC.

What I meant is a simple MAP_PRIVATE|MAP_ANON that is CoW shared between
parent and child.

[...]
-- 
Michal Hocko
SUSE Labs
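
For reference, the MAP_PRIVATE|MAP_ANON case can be reproduced from
userspace. The following is only a minimal sketch, assuming Linux with
glibc; cow_private_anon_demo() is a hypothetical helper name used for
illustration and is not part of the patch or of any kernel API. It faults
in one private anonymous page, forks, and lets the child read and then
write the page:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical illustration helper (userspace, not kernel code).
 *
 * Returns 0 on success, -1 on failure.
 *   *child_saw    - value the child read from the page right after fork()
 *   *parent_after - value the parent reads after the child wrote to the page
 */
int cow_private_anon_demo(int *child_saw, int *parent_after)
{
	long page = sysconf(_SC_PAGESIZE);
	int status;
	pid_t pid;
	int *p;

	if (page <= 0)
		return -1;

	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	*p = 42; /* fault the page in before fork() */

	pid = fork();
	if (pid < 0) {
		munmap(p, (size_t)page);
		return -1;
	}

	if (pid == 0) {
		/* Child: the read sees the parent's data (42). */
		int seen = *p;

		/* The write is private to the child (CoW). */
		*p = 7;
		_exit(seen); /* report the value read via the exit status */
	}

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
		munmap(p, (size_t)page);
		return -1;
	}

	*child_saw = WEXITSTATUS(status);
	*parent_after = *p; /* unaffected by the child's write */

	munmap(p, (size_t)page);
	return 0;
}
```

Calling cow_private_anon_demo(&a, &b) from main() should leave both a and
b at 42. Until the child writes, parent and child are expected to map the
same physical page through two different VMAs in two different tasks,
each of which may carry its own memory policy, so a reverse lookup that
stops at a single VMA would only see one of them.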