On Aug 13, 2010, at 7:57, Eric Sandeen wrote:
> Just out of curiosity, what do you see when the write cache is on?
> Seems counter-intuitive that it'd work better, but talking w/
> Ric Wheeler, he was curious... maybe Intel didn't test with the
> write cache off?
Data loss is much easier to trigger with the write cache on: it
happens to me on the first try. With the write cache off, I've only
been able to get it to occur with large writes (64 kB or larger), and
even then only about one attempt in three.
Others have observed data loss with the write cache enabled using
Intel SSDs. However, no one else seems to report data loss with the
cache disabled, which makes me wonder if I am doing something wrong.
With the X25-E:
http://www.mysqlperformanceblog.com/2009/03/02/ssd-xfs-lvm-fsync-write-cache-barrier-and-lost-transactions/
And with the X25-M G2:
http://thread.gmane.org/gmane.os.solaris.opensolaris.zfs/33472
> Also, would you be willing to publish the test you're using?
The programs I have been using are here (but see below):
http://people.csail.mit.edu/evanj/hg/index.cgi/hstore/file/tip/logging/minlogcrash.c
http://people.csail.mit.edu/evanj/hg/index.cgi/hstore/file/tip/logging/logfilecrashserver.cc
minlogcrash.c is actually a simplified version of my *real* test
program (below). However, that program has a lot of dependencies and
unrelated crap. Unfortunately, I'm away from my hardware for the next
10 days or so, so minlogcrash has not actually been crash tested. I
think it should be equivalent, but just in case, the crash tested
version is here:
http://people.csail.mit.edu/evanj/hg/index.cgi/hstore/file/tip/logging/logfilecrash.cc
My test procedure:
1. Start logfilecrashserver on a workstation:
./logfilecrashserver 12345
2. Start minlogcrash on the system under test (large writes are more
likely to lose data; I use 128 kB or so):
./minlogcrash tmp workstation 12345 131072
3. Once the workstation starts receiving log records, pull the power
from the back of the SSD.
4. Power off the system (my system doesn't support hotplug, so losing
power to the SSD makes it unhappy).
5. Reconnect power to the SSD.
6. Power the server back on.
7. Observe the output of logfilecrash using hexdump.
You should find that the file has *at least* the last record reported
by logfilecrashserver. It may have (part of) the next record. Error
modes I have observed: it is missing the last reported record
entirely; it has a truncated record; occasionally I get some sort of
media error in the kernel and I can't read the entire file.
Finally, full disclosure: I tested this a lot more with the Intel SSD
than with my magnetic disks. With the magnetic disks and barrier=0, I
was able to very easily see "lost writes", but with barrier=1 it
seemed to work. However, I still need to go back and re-test the
magnetic disks multiple times, to ensure they are behaving the way I
expect.
Evan
--
Evan Jones
http://evanjones.ca/