On Tue 06-05-14 15:49:29, Namjae Jeon wrote: > When we perform a data integrity sync we tag all the dirty pages with > PAGECACHE_TAG_TOWRITE at start of ext4_da_writepages. > Later we check for this tag in write_cache_pages_da and creates a > struct mpage_da_data containing contiguously indexed pages tagged with this > tag and sync these pages with a call to mpage_da_map_and_submit. > This process is done in while loop until all the PAGECACHE_TAG_TOWRITE pages > are synced. We also do journal start and stop in each iteration. > journal_stop could initiate journal commit which would call ext4_writepage > which in turn will call ext4_bio_write_page even for delayed OR unwritten > buffers. When ext4_bio_write_page is called for such buffers, even though it > does not sync them but it clears the PAGECACHE_TAG_TOWRITE of the corresponding > page and hence these pages are also not synced by the currently running data > integrity sync. We will end up with dirty pages although sync is completed. > > This could cause a potential data loss when the sync call is followed by a > truncate_pagecache call, which is exactly the case in collapse_range. > (It will cause generic/127 failure in xfstests) > > Cc: Jan Kara <jack@xxxxxxx> > Signed-off-by: Namjae Jeon <namjae.jeon@xxxxxxxxxxx> > Signed-off-by: Ashish Sangwan <a.sangwan@xxxxxxxxxxx> Just one comment below: ... > diff --git a/mm/page-writeback.c b/mm/page-writeback.c > index 023cf08..f7358c5 100644 > --- a/mm/page-writeback.c > +++ b/mm/page-writeback.c > @@ -2438,6 +2438,43 @@ int test_set_page_writeback(struct page *page) > } > EXPORT_SYMBOL(test_set_page_writeback); > > +int test_set_page_writeback_keepwrite(struct page *page) > +{ > + struct address_space *mapping = page_mapping(page); > + int ret; > + bool locked; > + unsigned long memcg_flags; > + > + mem_cgroup_begin_update_page_stat(page, &locked, &memcg_flags); > + if (mapping) { > + struct backing_dev_info *bdi = mapping->backing_dev_info; > + unsigned long flags; > + > + spin_lock_irqsave(&mapping->tree_lock, flags); > + ret = TestSetPageWriteback(page); > + if (!ret) { > + radix_tree_tag_set(&mapping->page_tree, > + page_index(page), > + PAGECACHE_TAG_WRITEBACK); > + if (bdi_cap_account_writeback(bdi)) > + __inc_bdi_stat(bdi, BDI_WRITEBACK); > + } > + if (!PageDirty(page)) > + radix_tree_tag_clear(&mapping->page_tree, > + page_index(page), > + PAGECACHE_TAG_DIRTY); > + spin_unlock_irqrestore(&mapping->tree_lock, flags); > + } else { > + ret = TestSetPageWriteback(page); > + } > + if (!ret) > + account_page_writeback(page); > + mem_cgroup_end_update_page_stat(page, &locked, &memcg_flags); > + return ret; > + > +} > +EXPORT_SYMBOL(test_set_page_writeback_keepwrite); > + Since the two variants of test_set_page_writeback() differ only a little I'd rather have __test_set_page_writeback(page, keep_write) and then define test_set_page_writeback() and test_set_page_writeback_keepwrite() as calls to that function. Other than that the patch is OK. Thanks! Honza -- Jan Kara <jack@xxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html