Re: mismatch_cnt again

Neil Brown <neilb@xxxxxxx> · Fri, 13 Nov 2009 13:15:41 +1100

On Tuesday November 10, davidsen@xxxxxxx wrote:
> NeilBrown wrote:
> 
> > You could possibly argue that it is a weakness in the interface to block
> > devices that the block device cannot ask for the buffer to be guaranteed
> > to be stable for the duration of the write, but there is little real
> > need for that, and it would probably be fairly hard to implement both
> > efficiently and generally.
> >
> >   
> The raid code would need its own copy of the data in a private buffer,
> or would have to mark the write memory as copy-on-write. I suspect the
> second is far more efficient, but I have no idea how hard it would be to
> implement.

Copy-on-write is not actually possible for md to enforce - it is at
the wrong layer and knows nothing about who owns the page or how or
where it is mapped.
A filesystem can impose copy-on-write, a block device cannot.
I gather from odd comments that I have seen that copy-on-write is
rather expensive.  Marking a thousand contiguous pages copy-on-write
is much faster than copying one thousand pages.  Making a single page
copy-on-write may not be much faster than copying the page.
However I'm not 100% certain of these details.

Maybe if the filesystem could set a flag in the bio saying "this page
will not change until the write completes", then md could optimise
that case and do copies in other cases...
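
To make that concrete, here is a rough userspace sketch of the idea, not
kernel code.  Everything in it is made up for illustration: struct
sketch_write stands in for a bio, page_is_stable is the hypothetical flag,
and the hook simulates the page's owner changing it in the middle of a
mirrored write.  No such flag is claimed to exist.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical stand-in for a write request; NOT the kernel's struct bio. */
struct sketch_write {
    const unsigned char *page; /* caller-owned data, may change under us */
    bool page_is_stable;       /* hypothetical "will not change" flag */
};

/* Lets a caller simulate the page owner modifying the page mid-write. */
typedef void (*sketch_midwrite_hook)(void *ctx);

/* Example hook: flips the first byte of the page passed as ctx. */
static void sketch_scribble(void *ctx)
{
    unsigned char *page = ctx;
    page[0] ^= 0xff;
}

/*
 * Write one page to two mirrors.
 * If the flag is set, both mirrors are written straight from the caller's
 * page (no copy).  Otherwise the page is snapshotted into a private buffer
 * first, so both mirrors receive identical data even if the page changes.
 */
static void sketch_mirror_write(const struct sketch_write *w,
                                unsigned char *mirror0,
                                unsigned char *mirror1,
                                sketch_midwrite_hook hook, void *ctx)
{
    unsigned char bounce[SKETCH_PAGE_SIZE];
    const unsigned char *src = w->page;

    if (!w->page_is_stable) {
        memcpy(bounce, w->page, SKETCH_PAGE_SIZE);
        src = bounce;
    }

    memcpy(mirror0, src, SKETCH_PAGE_SIZE);
    if (hook)
        hook(ctx); /* the owner may modify the page at this point */
    memcpy(mirror1, src, SKETCH_PAGE_SIZE);
}
```

With the flag clear, the mirrors stay identical at the cost of one page
copy.  With the flag set but the promise broken, the two mirrors end up
different, which is the kind of inconsistency a later check would count
as a mismatch.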

NeilBrown