Re: [patch 04/15] shmem: shmem_writepage() split unlikely i915 THP

Hi, Matthew,

Sorry for the late reply.  I just came back from a long holiday.

Matthew Wilcox <willy@xxxxxxxxxxxxx> writes:

> On Sat, Sep 19, 2020 at 05:18:47PM +0100, Matthew Wilcox wrote:
>> On Fri, Sep 18, 2020 at 10:44:32PM -0700, Hugh Dickins wrote:
>> > It behaves a lot better with this patch in than without it; but you're
>> > right, only the head will get written to swap, and the tails left in
>> > memory; with dirty cleared, so they may be left indefinitely (I've
>> > not yet looked to see when if ever PageDirty might get set later).
>> > 
>> > Hmm. It may just be a matter of restyling the i915 code with
>> > 
>> > 		if (!page_mapped(page)) {
>> > 			clear_page_dirty_for_io(page);
>> > 
>> > but I don't want to rush to that conclusion - there might turn
>> > out to be a good way of doing it at the shmem_writepage() end, but
>> > probably only hacks available.  I'll mull it over: it deserves some
>> > thought about what would suit, if a THP arrived here some other way.
>> 
>> I think the ultimate solution is to do as I have done for iomap and make
>> ->writepage handle arbitrary sized pages.  However, I don't know the
>> swap I/O path particularly well, and I would rather not learn it just yet.
>> 
>> How about this for a band-aid until we sort that out properly?  Just mark
>> the page as dirty before splitting it so subsequent iterations see the
>> subpages as dirty.  Arguably, we should use set_page_dirty() instead of
>> SetPageDirty, but I don't think i915 cares.  In particular, it uses
>> an untagged iteration instead of searching for PAGECACHE_TAG_DIRTY.
>> 
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index 271548ca20f3..6231207ab1eb 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -1362,8 +1362,21 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
>>  	swp_entry_t swap;
>>  	pgoff_t index;
>>  
>> -	VM_BUG_ON_PAGE(PageCompound(page), page);
>>  	BUG_ON(!PageLocked(page));
>> +
>> +	/*
>> +	 * If /sys/kernel/mm/transparent_hugepage/shmem_enabled is "force",
>> +	 * then drivers/gpu/drm/i915/gem/i915_gem_shmem.c gets huge pages,
>> +	 * and its shmem_writeback() needs them to be split when swapping.
>> +	 */
>> +	if (PageTransCompound(page)) {
>> +		/* Ensure the subpages are still dirty */
>> +		SetPageDirty(page);
>> +		if (split_huge_page(page) < 0)
>> +			goto redirty;
>> +		ClearPageDirty(page);
>> +	}
>> +
>>  	mapping = page->mapping;
>>  	index = page->index;
>>  	inode = mapping->host;
>
> It turns out that I have an entirely different reason for wanting
> ->writepage to handle an unsplit page.  In vmscan.c:shrink_page_list(),
> we currently try to split file-backed THPs.  This always fails for XFS
> file-backed THPs because they have page_private set which increments
> the refcount by 1.  And so we OOM when the page cache is full of XFS
> THPs.  I've been running successfully for a few days with this patch:
>
> @@ -1271,10 +1271,6 @@ static unsigned int shrink_page_list(struct list_head *page_list,
>                                 /* Adding to swap updated mapping */
>                                 mapping = page_mapping(page);
>                         }
> -               } else if (unlikely(PageTransHuge(page))) {
> -                       /* Split file THP */
> -                       if (split_huge_page_to_list(page, page_list))
> -                               goto keep_locked;
>                 }
>  
>                 /*
>
>
> Kirill points out that this will probably make shmem unhappy (it's
> possible that said pages will get split anyway if they're mapped
> because we pass TTU_SPLIT_HUGE_PMD into try_to_unmap()), but if
> they're (a) Dirty, (b) !mapped, we'll call pageout() which calls
> ->writepage().

Perhaps we could distinguish the shmem THPs from the XFS file cache THPs
via PageSwapBacked()?
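
To illustrate the idea, here is an untested userspace model, not kernel
code.  struct page_model and should_split_before_pageout() are
hypothetical names made up for this sketch; the two flags only stand in
for what PageTransHuge() and PageSwapBacked() would report.  It assumes
that shmem THPs still need splitting before ->writepage(), while
file-backed THPs such as the XFS ones are left unsplit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two page properties of interest. */
struct page_model {
	bool trans_huge;	/* models PageTransHuge(page) */
	bool swap_backed;	/* models PageSwapBacked(page) */
};

/*
 * Decide whether reclaim should split the page before writing it out.
 * Assumption: only swap-backed (shmem) THPs are split; file cache THPs
 * (e.g. XFS) are handed to ->writepage() unsplit.
 */
static bool should_split_before_pageout(const struct page_model *page)
{
	assert(page);
	return page->trans_huge && page->swap_backed;
}
```

In shrink_page_list() this would amount to keeping the "Split file THP"
branch, but only taking it when the page is also swap backed.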

Best Regards,
Huang, Ying

> The patch above is probably not exactly the right solution for this
> case, since pageout() calls writepage only once, not once for each
> sub-page.  This is hard to write a cute patch for because the
> pages get unlocked by split_huge_page().  I think I'm going to have
> to learn about the swap path, unless someone can save me from that.


