> -----Original Message-----
> From: Theodore Ts'o [mailto:tytso@xxxxxxx]
> Sent: Sunday, March 13, 2016 12:27 PM
> To: Jan Kara <jack@xxxxxxx>
> Cc: HUANG Weller (CM/ESW12-CN) <Weller.Huang@xxxxxxxxxxxx>; linux-ext4@xxxxxxxxxxxxxxx; Li, Michael <huayil@xxxxxxxxxxxxxxxx>
> Subject: Re: ext4 out of order when use cfq scheduler
>
> On Thu, Jan 07, 2016 at 12:47:36PM +0100, Jan Kara wrote:
> >
> > The problem is in all kernels starting with 3.8. Attached is a patch
> > which should fix the issue. Can you test whether it fixes the problem for you?
>
> Sorry, I missed this patch because it was attached to a discussion thread.
>
> > The problem is that although for delayed allocated blocks we write
> > their contents immediately after allocating them, there is no
> > guarantee that the IO scheduler or device doesn't reorder things
>
> I don't think that's the problem. In the commit thread when we call
> blkdev_issue_flush() that acts as a barrier so the I/O scheduler won't reorder writes
> after that point, which is before we write the commit block. Instead, I believe the
> problem is in ext4_writepages:
>
>         ext4_journal_stop(handle);
>         /* Submit prepared bio */
>         ext4_io_submit(&mpd.io_submit);
>
> Once we release the handle, the commit can start --- *before* we have
> a chance to submit the I/O. Oops.
>
> I believe if we swap these two calls, it should fix the problem Huang was seeing.
>
> Jan, do you agree?
>
> - Ted

Hi Ted and Jan,

If you send me a patch, I can rerun the verification on my kernel and hardware.

I also looked into the code. My test case uses the data=ordered mount option and does not call sync, so the write path should go through ext4_da_writepages(), right? My kernel version is 3.10.63, and as far as I can see, the I/O submit and ext4_journal_stop calls are already in the order you suggest there:

        while (!ret && wbc->nr_to_write > 0) {
                ext4_journal_start
                write_cache_pages_da
                mpage_da_map_and_submit
            ==> ext4_journal_stop
        }

Thanks.