Re: mbind() breaks its API definition since v5.2 by commit d883544515aa (mm: mempolicy: make the behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2019-10-30 at 12:32 Yang Shi wrote:
>
>
>On 10/29/19 8:12 PM, Li Xinhai wrote:
>> On 2019-10-30 at 10:50 Yang Shi wrote:
>>>
>>> On 10/29/19 7:27 PM, Li Xinhai wrote:
>>>> One change in do_mbind() of this commit has suspicious usage of return value of
>>>> queue_pages_range(), excerpt as below:
>>>>
>>>> ---
>>>> @@ -1243,10 +1265,15 @@ static long do_mbind(unsigned long start, unsigned long len,
>>>>     	if (err)
>>>>     	goto mpol_out;
>>>>   
>>>> -	err = queue_pages_range(mm, start, end, nmask,
>>>> +	ret = queue_pages_range(mm, start, end, nmask,
>>>>     	 flags | MPOL_MF_INVERT, &pagelist);
>>>> -	if (!err)
>>>> -	err = mbind_range(mm, start, end, new);
>>>> +
>>>> +	if (ret < 0) {      /////// convert to all possible 'ret' to '-EIO' <<<<
>>>> +	err = -EIO;
>>>> +	goto up_out;
>>>> +	}
>>>> +
>>>> +	err = mbind_range(mm, start, end, new);
>>>>   
>>>>     	if (!err) {
>>>>     	int nr_failed = 0;
>>>> ---
>>>>
>>>> Note that inside queue_pages_range(), the call to walk_page_range() may return
>>>> errors from 'test_walk' of 'struct mm_walk_ops', e.g. -EFAULT. Now, those error
>>>> codes are no longer reported to user space application.
>>>>
>>>>    From user space, the mbind() call need to reported error, with EFAULT, as example:
>>>> EFAULT
>>>> Part or all of the memory range specified by nodemask and maxnode points
>>>> outside your accessible address space. Or, there was an unmapped hole in the
>>>> specified memory range specified by addr and len.
>>> Thanks for catching this. That commit was aimed to correct the return
>>> values for some corner cases in mbind(), but it should not alter the
>>> errno for other failure cases, i.e. -EFAULT.
>>>
>>> Could you please try the below patch (build test only)?
>>>
>>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>>> index 4ae967b..99df43a 100644
>>> --- a/mm/mempolicy.c
>>> +++ b/mm/mempolicy.c
>>> @@ -1286,7 +1286,7 @@ static long do_mbind(unsigned long start, unsigned
>>> long len,
>>>                            flags | MPOL_MF_INVERT, &pagelist);
>>>
>>>          if (ret < 0) {
>>> -               err = -EIO;
>>> +               err = ret;
>>>                  goto up_out;
>>>          }
>>>
>>>
>> This seems do not work, because the 'pagelist' would have some pages queued
>> into it, need to put back those pages instead of return quickly.
>>
>> So, we need to remove this page leak as well. <<<<<<
>>
>> In my understanding, revert the changes as I quoted above may solve it, but not sure
>> the details about changes at end of do_mbind(), should keep them at there without
>> further change?
>
>Thanks for pointing this out. We don't have to revert this commit to
>handle the non-empty pagelist correctly. The simplest way is to just put
>those pages back and I'm supposed this is also the preferred way since
>mbind_range() is not called to really apply the policy so those pages
>should not be migrated.
>
>The below patch should solve this:
>
>diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>index 4ae967b..d80025c 100644
>--- a/mm/mempolicy.c
>+++ b/mm/mempolicy.c
>@@ -1286,7 +1286,10 @@ static long do_mbind(unsigned long start,
>unsigned long len,
>                           flags | MPOL_MF_INVERT, &pagelist);
>
>         if (ret < 0) {
>-               err = -EIO;
>+               if (!list_empty(&pagelist))
>+                       putback_movable_pages(&pagelist);
>+
>+               err = ret;
>                 goto up_out;
>         }
> 
Checked the original commit about the 'ret >0' purpose, and understood that 
was for migrating page in best effort way.

Your patch looks correct to me.

>>
>> - Xinhai
>>
>>>> Please correct me if this is the intended change(and will have updated API
>>>> definition), or something was misunderstood.
>>>>
>>>> -Xinhai
>> >
>
>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux