Re: [PATCH] mm/thp: flush file for !is_shmem PageDirty() case in collapse_file()

Song Liu <songliubraving@xxxxxx> · Thu, 31 Oct 2019 18:10:27 +0000

Resending the mail, as I sent previous one in "Rich Text". 

> On Oct 30, 2019, at 6:51 PM, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> 
> On Wed, 30 Oct 2019 13:07:36 -0700 Song Liu <songliubraving@xxxxxx> wrote:
> 
>> For non-shmem file THPs, khugepaged only collapses read only .text mapping
>> (VM_DENYWRITE). These pages should not be dirty except the case where the
>> file hasn't been flushed since first write.
>> 
>> Call filemap_flush() in collapse_file() to accelerate the write back in
>> such cases.
>> 
>> Also add warning if PageDirty() triggered for pages from readahead path.
>> 
>> Reported-and-tested-by: syzbot+efb9e48b9fbdc49bb34a@xxxxxxxxxxxxxxxxxxxxxxxxx
> 
> If sysbot reported something then we definitely want to be able to
> review that report when reviewing the patch!  So please do include a
> Link: for that sort of thing.
> 
> A bit of sleuthing leads me to
> http://lkml.kernel.org/r/000000000000c50fd70595fdd5b2@xxxxxxxxxx
> 
> which shows that this patch is in fact a fix against
> mmthp-recheck-each-page-before-collapsing-file-thp.patch.  This very
> important information wasn't in the changelog.
> 
> And this patch doesn't really fix the sysbot hang, does it?  The patch
> restores the flush which was removed in order to fix the sysbot hang.
> 
> In general, please do try to include all this sort of information when
> preparing changelogs.

Sorry for the confusion. This one is a bit tricky.

syzbot reported hang issue with filemap_flush(), which is already fixed
by previous patch which removes filemap_flush(). 

This patch tries to add filemap_flush() back, in a proper way. So this 
patch is an improvement, not a fix. That's why I didn't include the 
fixes tag. 

However, I used syzbot to test this patch, by replying "#syz test: ...". 
to the previous report (the one you found). I didn't cc the mail list
for those tests. syzbot did find bug with an earlier version, and gave 
green light for this version. This is the reason I would like to give 
syzbot credit with the tag. Maybe I should just make it "Tested-by: 
syzbot"? 

> 
> 
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -1601,6 +1601,33 @@ static void collapse_file(struct mm_struct *mm,
>> 					result = SCAN_FAIL;
>> 					goto xa_unlocked;
>> 				}
>> +				if (WARN_ON_ONCE(PageDirty(page))) {
> 
> I'm not understanding what guarantees this.  Can't another process
> which has the file open for writing come in and dirty the page after
> the readahead has completed and before this process locks the page?

For non-shmem file, khugepaged only work for VM_DENYWRITE mappings. 
Therefore, the page cannot be opened for write by another process.

> 
>> +					/*
>> +					 * page from readahead should not
>> +					 * be dirty. Show warning if this
>> +					 * somehow happens.
>> +					 */
>> +					result = SCAN_FAIL;
>> +					goto out_unlock;
>> +				}
>> +			} else if (PageDirty(page)) {
>> +				/*
>> +				 * khugepaged only works on read-only fd,
>> +				 * so this page is dirty because it hasn't
>> +				 * been flushed since first write. There
>> +				 * won't be new dirty pages.
>> +				 *
>> +				 * Trigger async flush here and hope the
>> +				 * writeback is done when khugepaged
>> +				 * revisits this page.
>> +				 *
>> +				 * This is a one-off situation. We are not
>> +				 * forcing writeback in loop.
>> +				 */
>> +				xas_unlock_irq(&xas);
>> +				filemap_flush(mapping);
>> +				result = SCAN_FAIL;
>> +				goto xa_unlocked;
>> 			} else if (trylock_page(page)) {
>> 				get_page(page);
>> 				xas_unlock_irq(&xas);
> 
> The patch mmthp-recheck-each-page-before-collapsing-file-thp.patch has
> undergone quite a bit of churn so I don't think it should be mainlined
> without more testing and review.  But it fixes a significant issue.  So
> could the appropriate developers please take some time to recheck and
> retest it all?

We are running production tests with previous version (no filemap_flush). 
syzbot also gives green light to that one. Maybe we should ship the 
version without filemap_flush with 5.4, and this improvement with 5.5?

This improvement also passed syzbot's test. If this looks good, we will 
also add this to production tests. 

Thanks,
Song