On Fri, Feb 26, 2010 at 2:20 PM, Asdo <asdo@xxxxxxxxxxxxx> wrote:
> Neil Brown wrote:
>>
>> Actually, I'm no longer convinced that the checksumming idea would work.
>> If a mem-mapped page that the app is updating every millisecond (i.e.
>> more often than the write latency) were written, then every time a write
>> completed the checksum would be different, so we would have to reschedule
>> the write, which would not be the correct behaviour at all.
>> So I think that the only way to address this in the md layer is to copy
>> the data and write the copy. There is already code to copy the data for
>> write-behind that could possibly be leveraged to do a copy always.
>>
>
> The concerns about slowdowns from copying could be addressed by making the
> copy a runtime choice, triggered by a sysctl interface or a file in
> /sys/block/mdX/md/ where one can echo "1" to enable copies for this type
> of raid. Or better, 1 could be the default (slower but safer, or if not
> safer, at least avoiding needless questions about mismatches on this ML
> from new users, and allowing detection of REAL mismatches, which can be
> due to cabling or defective disks), and echoing 0 would increase
> performance at the cost of seeing lots of false-positive mismatches.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

Isn't there some way of making the page copy-on-write using hardware
and/or an in-kernel structure? Ideally, copying could be avoided /unless/
there is a change. That way each operation looks like an atomic commit.
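
The trade-off discussed in this thread can be illustrated with a toy user-space simulation. This is only a sketch under stated assumptions, not md code: the names `mirror_write`, `touch`, and `checksum_retry_write` are hypothetical, a `bytearray` stands in for a memory-mapped page, and the "application" is a callback that changes the page while I/O is in flight. It shows why writing the live page to two mirrors can yield a mismatch, why snapshotting the page first avoids it, and why a checksum-and-retry scheme might never settle if the page changes on every attempt.

```python
import zlib

def touch(page):
    """Stand-in for an application updating a mapped page (hypothetical)."""
    page[0] = (page[0] + 1) % 256

def mirror_write(page, mutate_between, copy_first):
    """Simulate writing one page to two RAID1 members.

    mutate_between is called between the two member writes, modelling the
    application touching the page while the I/O is still in flight.
    If copy_first is True, an immutable snapshot is written to both members.
    """
    source = bytes(page) if copy_first else page
    disk_a = bytes(source)   # contents that reach the first member
    mutate_between(page)     # the live page changes mid-flight
    disk_b = bytes(source)   # contents that reach the second member
    return disk_a, disk_b

def checksum_retry_write(page, mutate_during_write, max_attempts=5):
    """Checksum before and after each write; retry if the page changed.

    Returns the attempt number that succeeded, or None if every attempt
    saw a different checksum afterwards.
    """
    for attempt in range(1, max_attempts + 1):
        before = zlib.crc32(page)
        mutate_during_write(page)   # page changes while the write is in flight
        after = zlib.crc32(page)
        if before == after:
            return attempt
    return None

# Zero-copy: the two members end up with different contents.
a, b = mirror_write(bytearray(4096), touch, copy_first=False)
zero_copy_mismatch = a != b

# Copy first: both members receive the same snapshot.
a, b = mirror_write(bytearray(4096), touch, copy_first=True)
copied_mismatch = a != b

# Checksum-and-retry never converges while the page keeps changing.
retry_result = checksum_retry_write(bytearray(4096), touch)
```

In this model the snapshot costs one full page copy per write, which is the slowdown the proposed runtime switch is meant to make optional. A copy-on-write scheme as asked about above would aim to pay that cost only when the page is actually modified during the write.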