On Fri, 26 Feb 2010 15:48:58 -0500 Bill Davidsen <davidsen@xxxxxxx> wrote:
> >
> > The idea of calculating a checksum before and after certainly has some
> > merit, if we could choose a checksum algorithm which was sufficiently
> > strong and sufficiently fast, though in many cases a large part of the
> > cost would just be bringing the page contents into cache - twice.
> >
> > It has the advantage over copying the page of not needing to allocate
> > extra memory.
> >
> > If someone wanted to prototype this and see how it goes, I'd be happy
> > to advise....
> >
> Disagree if you wish, but MD5 should be fine for this. While it is not
> cryptographically strong on files, where the size can be changed and
> evil doers can calculate values to add at the end of the data, it should
> be adequate on data of unchanging size. It's cheap, fast, and readily
> available.
>

Actually, I'm no longer convinced that the checksumming idea would work.
If a mem-mapped page were being written by an app that updates it every
millisecond (i.e. faster than the write latency), then every time a write
completed the checksum would be different, so we would have to reschedule
the write, which would not be the correct behaviour at all.

So I think that the only way to address this in the md layer is to copy
the data and write the copy. There is already code to copy the data for
write-behind that could possibly be leveraged to do a copy always.

Or I could just stop setting mismatch_cnt for raid1 and raid10. That
would also fix the problem :-)

NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
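[Editor's illustration, not part of the thread.] The failure mode Neil describes can be sketched in userspace. This is a toy model with hypothetical names (`HotPage`, `write_with_checksum`, `write_with_copy`), written in Python for brevity rather than kernel C: a page that is dirtied faster than the write latency never yields a stable before/after checksum, so the checksum scheme reschedules forever, while writing a private copy always produces consistent data.

```python
import hashlib

PAGE_SIZE = 4096

def checksum(page: bytes) -> str:
    # MD5 as suggested in the thread; any fast digest would do here.
    return hashlib.md5(page).hexdigest()

class HotPage:
    """Toy model of a mem-mapped page the app dirties on every access,
    i.e. faster than the array's write latency."""
    def __init__(self):
        self.counter = 0

    def read(self) -> bytes:
        self.counter += 1  # the app has touched the page again
        return self.counter.to_bytes(8, "little").ljust(PAGE_SIZE, b"\0")

def write_with_checksum(page: HotPage, max_retries: int = 5) -> int:
    """Checksum-before/after scheme: checksum, submit the write,
    checksum again; on mismatch, reschedule. Returns the number of
    retries used -- for a hot page it never converges."""
    retries = 0
    while retries < max_retries:
        before = checksum(page.read())
        _submitted = page.read()       # the 'device' sees whatever is there now
        after = checksum(page.read())
        if before == after:
            return retries             # data was stable: write is good
        retries += 1                   # mismatch: reschedule the write
    return retries

def write_with_copy(page: HotPage) -> bool:
    """Copy-then-write scheme: snapshot once, write the snapshot.
    Every mirror would receive this same stable copy."""
    snapshot = bytes(page.read())      # the copy cannot change underneath us
    return checksum(snapshot) == checksum(snapshot)

hot = HotPage()
print(write_with_checksum(hot))  # exhausts all retries: checksum never stable
print(write_with_copy(hot))      # the snapshot is always self-consistent
```

The copy approach trades the extra page allocation (which the checksum idea was meant to avoid) for correctness: once the data is copied, nothing the application does to the original page can make the mirrors diverge.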