Hello. As I remember Tejun asked me to change the comment to this commit and to clarify things little bit that WRITE_SYNC is not related to integrity write at all, the key is that a caller waits for completion and block layer or block device should prioritize the IO. >From the original comment it is not evident, so I sent v2 of this patch with the following comment (I hope it clarifies things): --- When data integrity operation happens (sync, fsync, fdatasync calls) writeback control is set to WB_SYNC_ALL. In that case all write requests are marked with WRITE_SYNC (WRITE | REQ_SYNC | REQ_NOIDLE) indicating that caller is waiting for completion and block layer or block device should prioritize the IO avoiding any possible delays. But mpage writeback path ignores marking requests as WRITE_SYNC. This patch fixes this. --- If you are going to take this patch could you please take the v2 comment also? -- Roman On Tue, Dec 16, 2014 at 8:03 AM, <akpm@xxxxxxxxxxxxxxxxxxxx> wrote: > From: Roman Pen <r.peniaev@xxxxxxxxx> > Subject: fs/mpage.c: forgotten WRITE_SYNC in case of data integrity write > > In case of wbc->sync_mode == WB_SYNC_ALL we need to do data integrity > write, thus mark request as WRITE_SYNC. > > akpm: afaict this change will cause the data integrity write bios to be > placed onto the second queue in cfq_io_cq.cfqq[], which presumably results > in special treatment. The documentation for REQ_SYNC is horrid. > > Signed-off-by: Roman Pen <r.peniaev@xxxxxxxxx> > Reviewed-by: Jan Kara <jack@xxxxxxx> > Cc: Jens Axboe <axboe@xxxxxxxxx> > Cc: Tejun Heo <tj@xxxxxxxxxx> > Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > --- > > fs/mpage.c | 23 +++++++++++++++-------- > 1 file changed, 15 insertions(+), 8 deletions(-) > > diff -puN fs/mpage.c~fs-mpagec-forgotten-write_sync-in-case-of-data-integrity-write fs/mpage.c > --- a/fs/mpage.c~fs-mpagec-forgotten-write_sync-in-case-of-data-integrity-write > +++ a/fs/mpage.c > @@ -482,6 +482,7 @@ static int __mpage_writepage(struct page > struct buffer_head map_bh; > loff_t i_size = i_size_read(inode); > int ret = 0; > + int wr = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE); > > if (page_has_buffers(page)) { > struct buffer_head *head = page_buffers(page); > @@ -590,7 +591,7 @@ page_is_mapped: > * This page will go to BIO. Do we need to send this BIO off first? > */ > if (bio && mpd->last_block_in_bio != blocks[0] - 1) > - bio = mpage_bio_submit(WRITE, bio); > + bio = mpage_bio_submit(wr, bio); > > alloc_new: > if (bio == NULL) { > @@ -614,7 +615,7 @@ alloc_new: > */ > length = first_unmapped << blkbits; > if (bio_add_page(bio, page, length, 0) < length) { > - bio = mpage_bio_submit(WRITE, bio); > + bio = mpage_bio_submit(wr, bio); > goto alloc_new; > } > > @@ -624,7 +625,7 @@ alloc_new: > set_page_writeback(page); > unlock_page(page); > if (boundary || (first_unmapped != blocks_per_page)) { > - bio = mpage_bio_submit(WRITE, bio); > + bio = mpage_bio_submit(wr, bio); > if (boundary_block) { > write_boundary_block(boundary_bdev, > boundary_block, 1 << blkbits); > @@ -636,7 +637,7 @@ alloc_new: > > confused: > if (bio) > - bio = mpage_bio_submit(WRITE, bio); > + bio = mpage_bio_submit(wr, bio); > > if (mpd->use_writepage) { > ret = mapping->a_ops->writepage(page, wbc); > @@ -692,8 +693,11 @@ mpage_writepages(struct address_space *m > }; > > ret = write_cache_pages(mapping, wbc, __mpage_writepage, &mpd); > - if (mpd.bio) > - mpage_bio_submit(WRITE, mpd.bio); > + if (mpd.bio) { > + int wr = (wbc->sync_mode == WB_SYNC_ALL ? > + WRITE_SYNC : WRITE); > + mpage_bio_submit(wr, mpd.bio); > + } > } > blk_finish_plug(&plug); > return ret; > @@ -710,8 +714,11 @@ int mpage_writepage(struct page *page, g > .use_writepage = 0, > }; > int ret = __mpage_writepage(page, wbc, &mpd); > - if (mpd.bio) > - mpage_bio_submit(WRITE, mpd.bio); > + if (mpd.bio) { > + int wr = (wbc->sync_mode == WB_SYNC_ALL ? > + WRITE_SYNC : WRITE); > + mpage_bio_submit(wr, mpd.bio); > + } > return ret; > } > EXPORT_SYMBOL(mpage_writepage); > _ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>