Re: [v2 PATCH 2/2] mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind

Vlastimil Babka <vbabka@xxxxxxx> · Wed, 17 Jul 2019 20:50:05 +0200

On 7/17/19 8:23 PM, Yang Shi wrote:
> 
> 
> On 7/16/19 10:28 AM, Yang Shi wrote:
>>
>>
>> On 7/16/19 5:07 AM, Vlastimil Babka wrote:
>>> On 6/22/19 2:20 AM, Yang Shi wrote:
>>>> @@ -969,10 +975,21 @@ static long do_get_mempolicy(int *policy, 
>>>> nodemask_t *nmask,
>>>>   /*
>>>>    * page migration, thp tail pages can be passed.
>>>>    */
>>>> -static void migrate_page_add(struct page *page, struct list_head 
>>>> *pagelist,
>>>> +static int migrate_page_add(struct page *page, struct list_head 
>>>> *pagelist,
>>>>                   unsigned long flags)
>>>>   {
>>>>       struct page *head = compound_head(page);
>>>> +
>>>> +    /*
>>>> +     * Non-movable page may reach here.  And, there may be
>>>> +     * temporaty off LRU pages or non-LRU movable pages.
>>>> +     * Treat them as unmovable pages since they can't be
>>>> +     * isolated, so they can't be moved at the moment.  It
>>>> +     * should return -EIO for this case too.
>>>> +     */
>>>> +    if (!PageLRU(head) && (flags & MPOL_MF_STRICT))
>>>> +        return -EIO;
>>>> +
>>> Hm but !PageLRU() is not the only way why queueing for migration can
>>> fail, as can be seen from the rest of the function. Shouldn't all cases
>>> be reported?
>>
>> Do you mean the shared pages and isolation failed pages? I'm not sure 
>> whether we should consider these cases break the semantics or not, so 
>> I leave them as they are. But, strictly speaking they should be 
>> reported too, at least for the isolation failed page.

CC'd linux-api, should be done on v3 posting also.

> By reading mbind man page, it says:
> 
> If MPOL_MF_MOVE is specified in flags, then the kernel will attempt to 
> move all the existing pages in the memory range so that they follow the 
> policy.  Pages that are shared with other processes will not be moved.  
> If MPOL_MF_STRICT is also specified, then the call fails with the error 
> EIO if some pages could not be moved.

I don't think this means that for shared pages, -EIO should not be
reported. I can imagine both interpretations of the paragraph. I guess
we can be conservative and keep not reporting them, if that was always
the case - but then perhaps clarify the man page?

> It looks the code already handles shared page correctly, we just need 
> return -EIO for isolation failed page if MPOL_MF_STRICT is specified.
> 
>>
>> Thanks,
>> Yang
>>
>>>
>>>>       /*
>>>>        * Avoid migrating a page that is shared with others.
>>>>        */
>>>> @@ -984,6 +1001,8 @@ static void migrate_page_add(struct page *page, 
>>>> struct list_head *pagelist,
>>>>                   hpage_nr_pages(head));
>>>>           }
>>>>       }
>>>> +
>>>> +    return 0;
>>>>   }
>>>>     /* page allocation callback for NUMA node migration */
>>>> @@ -1186,9 +1205,10 @@ static struct page *new_page(struct page 
>>>> *page, unsigned long start)
>>>>   }
>>>>   #else
>>>>   -static void migrate_page_add(struct page *page, struct list_head 
>>>> *pagelist,
>>>> +static int migrate_page_add(struct page *page, struct list_head 
>>>> *pagelist,
>>>>                   unsigned long flags)
>>>>   {
>>>> +    return -EIO;
>>>>   }
>>>>     int do_migrate_pages(struct mm_struct *mm, const nodemask_t *from,
>>>>
>>
>