On 10/16/18 1:51 AM, Jan Kara wrote: > On Sun 14-10-18 10:01:24, Dave Chinner wrote: >> On Sat, Oct 13, 2018 at 12:34:12AM -0700, John Hubbard wrote: >>> On 10/12/18 8:55 PM, Dave Chinner wrote: >>>> On Thu, Oct 11, 2018 at 11:00:12PM -0700, john.hubbard@xxxxxxxxx wrote: >>>>> From: John Hubbard <jhubbard@xxxxxxxxxx> >>> [...] >>>>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h >>>>> index 5ed8f6292a53..017ab82e36ca 100644 >>>>> --- a/include/linux/mm_types.h >>>>> +++ b/include/linux/mm_types.h >>>>> @@ -78,12 +78,22 @@ struct page { >>>>> */ >>>>> union { >>>>> struct { /* Page cache and anonymous pages */ >>>>> - /** >>>>> - * @lru: Pageout list, eg. active_list protected by >>>>> - * zone_lru_lock. Sometimes used as a generic list >>>>> - * by the page owner. >>>>> - */ >>>>> - struct list_head lru; >>>>> + union { >>>>> + /** >>>>> + * @lru: Pageout list, eg. active_list protected >>>>> + * by zone_lru_lock. Sometimes used as a >>>>> + * generic list by the page owner. >>>>> + */ >>>>> + struct list_head lru; >>>>> + /* Used by get_user_pages*(). Pages may not be >>>>> + * on an LRU while these dma_pinned_* fields >>>>> + * are in use. >>>>> + */ >>>>> + struct { >>>>> + unsigned long dma_pinned_flags; >>>>> + atomic_t dma_pinned_count; >>>>> + }; >>>>> + }; >>>> >>>> Isn't this broken for mapped file-backed pages? i.e. they may be >>>> passed as the user buffer to read/write direct IO and so the pages >>>> passed to gup will be on the active/inactive LRUs. hence I can't see >>>> how you can have dual use of the LRU list head like this.... >>>> >>>> What am I missing here? >>> >>> Hi Dave, >>> >>> In patch 6/6, pin_page_for_dma(), which is called at the end of get_user_pages(), >>> unceremoniously rips the pages out of the LRU, as a prerequisite to using >>> either of the page->dma_pinned_* fields. >> >> How is that safe? If you've ripped the page out of the LRU, it's no >> longer being tracked by the page cache aging and reclaim algorithms. >> Patch 6 doesn't appear to put these pages back in the LRU, either, >> so it looks to me like this just dumps them on the ground after the >> gup reference is dropped. How do we reclaim these page cache pages >> when there is memory pressure if they aren't in the LRU? > > Yeah, that's a bug in patch 6/6 (possibly in ClearPageDmaPinned). It should > return the page to the LRU from put_user_page(). > Yes. Ugh, the LRU handling in this series is definitely not all there yet: probably need to track (in the page->dma_pinned_flags) which LRU (if any) each page was taken from. It's hard to say exactly what the active/inactive/unevictable list should be when DMA is done and put_user_page*() is called, because we don't know if some device read, wrote, or ignored any of those pages. Although if put_user_pages_dirty() is called, that's an argument for "active", at least. And maybe this will all be pointless if the DIRECT_IO performance test, that Christoph requested, shows that LRU operations are too expensive here, anyway. I wonder if we should just limit this to 64-bit arches and find a real page flag...well, let's see what the testing shows first I suppose. -- thanks, John Hubbard NVIDIA