Re: Why does one get mismatches?

Neil Brown wrote:
On Fri, 26 Feb 2010 15:48:58 -0500
Bill Davidsen <davidsen@xxxxxxx> wrote:

The idea of calculating a checksum before and after certainly has some merit,
if we could choose a checksum algorithm which was sufficiently strong and
sufficiently fast, though in many cases a large part of the cost would just be
bringing the page contents into cache - twice.

It has the advantage over copying the page of not needing to allocate extra
memory.

If someone wanted to try and prototype this to see how it goes, I'd be happy
to advise....
Disagree if you wish, but MD5 should be fine for this. While it is not cryptographically strong on files, where the size can be changed and evildoers can compute values to append to the data, it should be adequate for data of unchanging size. It's cheap, fast, and readily available.
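The before-and-after checksum idea can be sketched in a few lines. This is an illustrative userspace model, not md code: `write_unchanged` and its `do_io` callback are hypothetical names standing in for the real write path.

```python
import hashlib

def checksum(page: bytes) -> bytes:
    # MD5 over a fixed-size page: the known attacks need attacker-chosen
    # data appended to the message, which a fixed page size rules out.
    return hashlib.md5(page).digest()

def write_unchanged(page: bytearray, do_io) -> bool:
    """Checksum the page before and after the I/O callback runs.

    Returns True if the page was stable for the whole write, False if
    something (mmap, another thread, aio) dirtied it in flight.
    """
    before = checksum(bytes(page))
    do_io(bytes(page))                      # the actual (slow) write
    return checksum(bytes(page)) == before  # False => dirtied mid-write
```

A False return is the case discussed below: the data changed under the write, so the copies on the mirrors may not match.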


Actually, I'm no longer convinced that the checksumming idea would work.
If a mem-mapped page were written which the app updates every
millisecond (i.e. more often than the write latency), then every time a write
completed the checksum would be different, so we would have to reschedule the
write, which would not be the correct behaviour at all.
So I think that the only way to address this in the md layer is to copy
the data and write the copy.  There is already code to copy the data for
write-behind that could possibly be leveraged to do a copy always.

Your point about that possibility is valid, but consider this: if the checksum fails, do the copy at that point and write again.
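That fallback, write optimistically, and only on a checksum mismatch take a stable snapshot and rewrite it to every mirror, can be sketched as below. Again a userspace model under assumed names: `mirrors` is a list of per-device write callables, not a real md interface.

```python
import hashlib

def md5(buf: bytes) -> bytes:
    return hashlib.md5(buf).digest()

def write_mirrors(page: bytearray, mirrors) -> None:
    """Write `page` to each mirror; if the page changed while the writes
    were in flight, fall back to one private snapshot and rewrite it,
    so all mirrors end up holding identical data.
    """
    before = md5(bytes(page))
    for write in mirrors:
        write(bytes(page))          # each mirror may see a different version
    if md5(bytes(page)) != before:
        # Page was dirtied mid-write: freeze a copy and rewrite everywhere.
        snapshot = bytes(page)
        for write in mirrors:
            write(snapshot)
```

The steady-state cost is one extra checksum pass per write; the copy and second write happen only on the (presumably rare) mismatch, which is the low-overhead property argued for below.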
Or I could just stop setting mismatch_cnt for raid1 and raid10.  That would
also fix the problem :-)

s/fix/hide/  ;-)

My feeling is that we have many ways to change the data: O_DIRECT, aio, threads, mmap, and probably some I haven't found yet. Rather than trying to prevent that, which would require a flaming layer violation, perhaps my thought above applies: detect that the data has changed, and at that point make a copy and write unchanging data to all drives. How that plays with O_DIRECT I can't say, but it sounds to me as if it should eliminate the mismatches without a huge performance impact. Let me know if this addresses your concern about rescheduling the write forever, while not taking much overhead.

The question is why this happens with raid-1 and doesn't seem to with raid-[56]. And I don't see mismatches on my raid-10, although I'm pretty sure that neither mmap nor O_DIRECT is used on those arrays.

What would seem optimal is some copy-on-write protection on the buffer, to prevent it from being modified while it is in use for actual I/O. The hardware doesn't seem to support that, though: page size, buffer size, and sector size all vary.

--
Bill Davidsen <davidsen@xxxxxxx>
 "We can't solve today's problems by using the same thinking we
  used in creating them." - Einstein

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
