Re: [PATCH RFC 0/9] mm, sparse-vmemmap: Introduce compound pagemaps

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/23/21 4:44 PM, Dan Williams wrote:
> On Tue, Feb 23, 2021 at 8:30 AM Joao Martins <joao.m.martins@xxxxxxxxxx> wrote:
>>
>> On 2/20/21 1:18 AM, Dan Williams wrote:
>>> On Tue, Dec 8, 2020 at 9:32 AM Joao Martins <joao.m.martins@xxxxxxxxxx> wrote:
>>>> Patch 6 - 8: Optimize grabbing/release a page refcount changes given that we
>>>> are working with compound pages i.e. we do 1 increment/decrement to the head
>>>> page for a given set of N subpages compared as opposed to N individual writes.
>>>> {get,pin}_user_pages_fast() for zone_device with compound pagemap consequently
>>>> improves considerably, and unpin_user_pages() improves as well when passed a
>>>> set of consecutive pages:
>>>>
>>>>                                            before          after
>>>>     (get_user_pages_fast 1G;2M page size) ~75k  us -> ~3.2k ; ~5.2k us
>>>>     (pin_user_pages_fast 1G;2M page size) ~125k us -> ~3.4k ; ~5.5k us
>>>
>>> Compelling!
>>>
>>
>> BTW is there any reason why we don't support pin_user_pages_fast() with FOLL_LONGTERM for
>> device-dax?
>>
> 
> Good catch.
> 
> Must have been an oversight of the conversion. FOLL_LONGTERM collides
> with filesystem operations, but not device-dax. 

hmmmm, fwiw, it was unilaterally disabled for any devmap pmd/pud in commit 7af75561e171
("mm/gup: add FOLL_LONGTERM capability to GUP fast") and I must only assume that
by "DAX pages" the submitter was only referring to fs-dax pages.

> In fact that's the
> motivation for device-dax in the first instance, no need to coordinate
> runtime physical address layout changes because the device is
> statically allocated.
> 
/me nods

>> Looking at the history, I understand that fsdax can't support it atm, but I am not sure
>> that the same holds for device-dax. I have this small chunk (see below the scissors mark)
>> which relaxes this for a pgmap of type MEMORY_DEVICE_GENERIC, albeit not sure if there is
>> a fundamental issue for the other types that makes this an unwelcoming change.
>>
>>         Joao
>>
>> --------------------->8---------------------
>>
>> Subject: [PATCH] mm/gup: allow FOLL_LONGTERM pin-fast for
>>  MEMORY_DEVICE_GENERIC
>>
>> The downside would be one extra lookup in dev_pagemap tree
>> for other pgmap->types (P2P, FSDAX, PRIVATE). But just one
>> per gup-fast() call.
> 
> I'd guess a dev_pagemap lookup is faster than a get_user_pages slow
> path. It should be measurable that this change is at least as fast or
> faster than falling back to the slow path, but it would be good to
> measure.
> 
But with the changes I am/will-be making I hope gup-fast and gup-slow
will be as fast (for present pmd/puds ofc, as the fault makes it slower).

I'll formally submit below patch, once I ran over the numbers.

> Code changes look good to me.
> 
Cool! Will add in the suggested change below.

>>
>> Signed-off-by: Joao Martins <joao.m.martins@xxxxxxxxxx>
>> ---
>>  include/linux/mm.h |  5 +++++
>>  mm/gup.c           | 24 +++++++++++++-----------
>>  2 files changed, 18 insertions(+), 11 deletions(-)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 32f0c3986d4f..c89a049bbd7a 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -1171,6 +1171,11 @@ static inline bool is_pci_p2pdma_page(const struct page *page)
>>                 page->pgmap->type == MEMORY_DEVICE_PCI_P2PDMA;
>>  }
>>
>> +static inline bool devmap_longterm_available(const struct dev_pagemap *pgmap)
>> +{
> 
> I'd call this devmap_can_longterm().
> 
Ack.




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux