Re: async commit & write barrier code

Theodore Tso wrote:
> On Tue, Sep 23, 2008 at 03:41:02PM -0500, Eric Sandeen wrote:
>> I agree; with async commit, ext4/jbd2 is running with *no* barrier
>> writes in jbd code. (FWIW, on the fsync front, fsync calls
>> blkdev_issue_flush in ext4 so that part may actually be ok in the end).
>>
>> But at a minimum, I think that for data=ordered, there is now *no*
>> guarantee that the associated file data actually hits disk before the
>> size updates, is there?
>
> I think the theory behind this was that the journal checksums would
> protect us against misordered writes.  But yes, this means that we
> would effectively have data=writeback, and not data=ordered.  More
> seriously, when I started using my root filesystem with async commit,
> when the system crashed after suspend/resumes, I was seeing filesystem
> corruptions which caused data loss and which required e2fsck to fix.
> I've commented the patch out of the series file for now, until we can
> do some more testing of async commit.
>
> 							- Ted

I think that is definitely the right thing to do at this point. In addition to testing, we should try to be very clear on how async commit interacts with barriers, data integrity, etc.

What worries me is how arbitrary the semantics can be, given that storage devices (without flush or similar operations) can totally reorder IO requests. One specific worry is that medium to large IOs can often bypass the write cache entirely, with the write cache itself used only for small writes. That means the small writes associated with the commit record can stay in volatile write cache memory for a long time and go away on power loss (or suspend to disk!).
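
To make that worry concrete, here is a toy model in Python. It is only a sketch under stated assumptions: the `ToyDisk` class, the 64 KiB bypass threshold, and the block names are all hypothetical and do not describe any real drive or the jbd2 code. It just shows a large data write reaching the media while the small commit record is lost from the volatile cache unless a flush is issued.

```python
class ToyDisk:
    """Hypothetical drive model: small writes are cached, large ones bypass the cache."""

    def __init__(self, cache_bypass_bytes=64 * 1024):
        self.media = {}  # blocks that survive power loss
        self.cache = {}  # blocks held only in volatile cache memory
        self.cache_bypass_bytes = cache_bypass_bytes  # assumed threshold

    def write(self, block, data):
        if len(data) >= self.cache_bypass_bytes:
            # Large write: goes straight to the media and supersedes any cached copy.
            self.cache.pop(block, None)
            self.media[block] = data
        else:
            # Small write: acknowledged, but only held in volatile memory.
            self.cache[block] = data

    def flush(self):
        # Cache flush: everything cached becomes persistent.
        self.media.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        # Whatever was still in the volatile cache is gone.
        self.cache.clear()


def commit_then_crash(disk, flush):
    """Write a large data block and a small commit record, then lose power."""
    disk.write("data", b"d" * (128 * 1024))
    disk.write("commit", b"c" * 512)
    if flush:
        disk.flush()
    disk.power_loss()
    return sorted(disk.media)


print(commit_then_crash(ToyDisk(), flush=False))  # commit record is missing
print(commit_then_crash(ToyDisk(), flush=True))   # both blocks are on the media
```

Without the flush, only the `data` block survives in this model; with it, both `commit` and `data` do. A real device may of course reorder in other ways as well.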

What are the basic assumptions (wish lists?) that we have for ordering and persistence of the write sequence for our existing journal code?
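
As a starting point for that discussion, the checksum theory Ted mentions could be sketched roughly as follows. This is an illustration, not the jbd2 implementation: the function names and the record layout are hypothetical, and `zlib.crc32` simply stands in for whatever checksum the journal actually uses. The commit record carries a checksum over the transaction's journal blocks, and replay discards the transaction if the blocks found on disk do not match.

```python
import zlib


def checksum_blocks(blocks):
    """Running CRC32 over a sequence of journal blocks (stand-in checksum)."""
    crc = 0
    for blk in blocks:
        crc = zlib.crc32(blk, crc)
    return crc


def make_commit_record(journal_blocks):
    """Hypothetical commit record: block count plus checksum of the blocks."""
    return {"nblocks": len(journal_blocks), "crc": checksum_blocks(journal_blocks)}


def transaction_is_replayable(commit_record, blocks_on_disk):
    """At replay, accept the transaction only if every journal block matches."""
    if commit_record is None:
        return False  # the commit record itself never reached the media
    if len(blocks_on_disk) != commit_record["nblocks"]:
        return False
    return checksum_blocks(blocks_on_disk) == commit_record["crc"]


blocks = [b"inode update", b"bitmap update"]
record = make_commit_record(blocks)

# Commit record reached the media, but the last journal block is stale.
stale = [blocks[0], blocks[1][:-1] + b"X"]

print(transaction_is_replayable(record, blocks))  # True
print(transaction_is_replayable(record, stale))   # False
print(transaction_is_replayable(None, blocks))    # False
```

Note that the checksum here covers only the journal blocks. Nothing in it says whether file data written outside the journal reached the disk before the metadata did, which is why we would effectively be left with data=writeback rather than data=ordered.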

Ric
