Re: [RFC PATCH 2/2] mm/gup: introduce vaddr_pin_pages_remote()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 8/13/19 2:08 PM, Ira Weiny wrote:
> On Mon, Aug 12, 2019 at 05:07:32PM -0700, John Hubbard wrote:
>> On 8/12/19 4:49 PM, Ira Weiny wrote:
>>> On Sun, Aug 11, 2019 at 06:50:44PM -0700, john.hubbard@xxxxxxxxx wrote:
>>>> From: John Hubbard <jhubbard@xxxxxxxxxx>
>> ...
>>> Thinking about this part of the patch... is this pin really necessary?  This
>>> code is not doing a long term pin.  The page just needs a reference while we
>>> map it into the devices page tables.  Once that is done we should get notifiers
>>> if anything changes and we can adjust.  right?
>>>
>>
>> OK, now it's a little interesting: the FOLL_PIN is necessary, but maybe not
>> FOLL_LONGTERM. Illustrating once again that it's actually necessary to allow
>> these flags to vary independently.
> 
> Why is PIN necessary?  I think we do want all drivers to use the new
> user_uaddr_vaddr_pin_user_pages() call...  :-P  But in this case I think a
> simple "get" reference is enough to reference the page while we are using it.
> If it changes after the "put/unpin" we get a fault which should handle the
> change right?
> 

FOLL_PIN is necessary because the caller is clearly in the use case that 
requires it--however briefly they might be there. As Jan described it,

"Anything that gets page reference and then touches page data (e.g. direct IO)
needs the new kind of tracking so that filesystem knows someone is messing with
the page data." [1]

> The other issue I have with FOLL_PIN is what does it mean to call "...pin...()"
> without FOLL_PIN?
> 
> This is another confusion of get_user_pages()...  you can actually call it
> without FOLL_GET...  :-/  And you just don't get pages back.  I've never really
> dug into how (or if) you "put" them later...
>

Yes, you are talking to someone who has been suffering through that. The
problem here is that gup() has evolved into a do-everything tool. I think we're
getting closer to teasing it apart into more specific interfaces that do
more limited things.

Anyway, I want FOLL_PIN as a way to help clarify this...

 
>>
>> And that leads to another API refinement idea: let's set FOLL_PIN within the
>> vaddr_pin_pages*() wrappers, and set FOLL_LONGTER in the *callers* of those
>> wrappers, yes?
> 
> I've thought about this before and I think any default flags should simply
> define what we want follow_pages to do.

Hmm, so don't forget that we need to know what gup_fast() should do, too.

Anyway, I'm not worried about which combination of wrapper calls set which flags,
I'm open to suggestion there. But it does still seem to me that we should have
independent FOLL_LONGTERM and FOLL_PIN flags, once the API churn settles. 


> 
> Also, the addition of vaddr_pin information creates an implicit flag which if
> not there disallows any file pages from being pinned.  It becomes our new
> "longterm" flag.  FOLL_PIN _could_ be what we should use "internally".  But we
> could also just use this implicit vaddr_pin flag and not add a new flag.

I'd like to have FOLL_PIN internally, in order to solve the problems that
you just raised, above! Namely, that it's too hard to figure out all the
cases that gup() is handling.

With FOLL_PIN, we know that we should be taking the new pin refcount, and
releasing it via the (currently named) put_user_page*(), ultimately.

> 
> Finally, I struggle with converting everyone to a new call.  It is more
> overhead to use vaddr_pin in the call above because now the GUP code is going
> to associate a file pin object with that file when in ODP we don't need that
> because the pages can move around.

What if the pages in ODP are file-backed? 

> 
> This overhead may be fine, not sure in this case, but I don't see everyone
> wanting it.

I think most callers don't have much choice, otherwise they'll be broken with
file-backed memory.

thanks,
-- 
John Hubbard
NVIDIA



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux