Re: RFC: detection of silent corruption via ATA long sector reads

John Robinson <john.robinson@xxxxxxxxxxxxxxxx> · Sun, 04 Jan 2009 13:49:23 +0000

On 04/01/2009 12:31, John Robinson wrote:
> On 04/01/2009 07:37, Martin K. Petersen wrote:
> [...]
>> We also don't want to do checksumming at every layer.  That's going to
>> suck from a performance perspective.  It's better to do checksumming
>> high up in the stack and only do it once.  As long as we give the upper
>> layers the option of re-driving the I/O.
>>
>> That involves adding a cookie to each bio that gets filled out by DM/MD
>> on completion.  If the filesystem checksum fails we can resubmit the I/O
>> and pass along the cookie indicating that we want a different copy than
>> the one the cookie represents.
>
> I'd like to understand this mechanism better; at first glance it's
> either going to be too simplistic and not cover the various block layer
> cases well, or it means you end up re-implementing RAID and LVM in the
> filesystem.

I've thought about this again, and I'm wrong; there may be complications 
in handling the cookies up and down the stack where more than one layer 
thinks it knows how to have another go, but I can see what you describe 
as being useful and relatively device-agnostic.
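
To make the cookie idea concrete, here is a minimal user-space sketch in Python, not kernel code. Everything in it is an assumption made for illustration: the names `Mirror`, `ReadResult`, `fs_read` and the `avoid` parameter are hypothetical, the cookie is simply the index of the copy that served the read, and CRC32 stands in for whatever checksum the filesystem actually uses.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ReadResult:
    data: bytes
    cookie: int  # hypothetical: index of the mirror copy that served the read


class Mirror:
    """Toy stand-in for an MD/DM mirror; each copy maps block number -> bytes."""

    def __init__(self, copies: List[Dict[int, bytes]]):
        self.copies = copies

    def read(self, block: int, avoid: Optional[int] = None) -> ReadResult:
        # Serve from the first copy that is not the one the caller rejected.
        for idx, copy in enumerate(self.copies):
            if idx != avoid:
                return ReadResult(copy[block], cookie=idx)
        raise OSError("no alternative copy available")


def fs_read(dev: Mirror, block: int, expected_crc: int) -> bytes:
    """Filesystem-side read: checksum once, re-drive the I/O on a mismatch."""
    result = dev.read(block)
    if zlib.crc32(result.data) == expected_crc:
        return result.data
    # Checksum failed: resubmit, handing the cookie back to ask for another copy.
    retry = dev.read(block, avoid=result.cookie)
    if zlib.crc32(retry.data) == expected_crc:
        return retry.data
    raise OSError("all copies failed checksum")
```

A single-integer cookie is only enough for one layer with two copies. With stacked layers, or more than two copies, the cookie would have to record which choices have already been tried at each level, which is where the complications mentioned above come in.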

I wonder if there might also be scope for cookies going down through the 
stack to carry an indication of how hard to try; some filesystems or 
other consumers of block devices may be willing to ask again or want to 
be told about problems quickly (e.g. btrfs over RAID over TLER-equipped 
discs), while some may need best efforts all out first time because they 
can't cope with failure returns (e.g. FAT over cheap IDE discs).
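
As a rough illustration of such a hint, the following Python sketch shows a lower layer choosing its retry budget from a value passed down by the caller. The names `Effort`, `FlakyDisk` and `read_with_hint`, and the `max_retries` default, are all hypothetical; no such interface exists in the block layer as far as I know.

```python
from enum import Enum


class Effort(Enum):
    FAIL_FAST = 1    # caller has redundancy above and wants errors quickly
    BEST_EFFORT = 2  # caller cannot handle failures; retry hard before giving up


class FlakyDisk:
    """Toy device that fails a fixed number of reads before succeeding."""

    def __init__(self, failures_before_success: int):
        self.remaining = failures_before_success
        self.attempts = 0

    def read_once(self):
        self.attempts += 1
        if self.remaining > 0:
            self.remaining -= 1
            return None  # simulated read error
        return b"data"


def read_with_hint(disk: FlakyDisk, effort: Effort, max_retries: int = 5) -> bytes:
    # The hint travelling down the stack decides how many attempts are made.
    tries = 1 if effort is Effort.FAIL_FAST else 1 + max_retries
    for _ in range(tries):
        data = disk.read_once()
        if data is not None:
            return data
    raise OSError("read failed")
```

With this shape, a checksumming filesystem on redundant storage would pass `Effort.FAIL_FAST` and fall back to another copy on error, while something like FAT would pass `Effort.BEST_EFFORT`.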

Anyway, I think I'd better leave all this to the experts i.e. you :-)

Cheers,

John.
