On Tuesday November 10, davidsen@xxxxxxx wrote:
> NeilBrown wrote:
>
> > You could possibly argue that it is a weakness in the interface to block
> > devices that the block device cannot ask for the buffer to be guaranteed
> > to be stable for the duration of the write, but there is little real
> > need for that, and it would probably be fairly hard to implement both
> > efficiently and generally.
> >
>
> The raid code would need its own copy of the data in a private buffer,
> or would have to mark the write memory as copy on write. I suspect the
> second is far more efficient, but I have no idea how hard it would be to
> implement.

Copy-on-write is not actually possible for md to enforce: it is at the
wrong layer and knows nothing about who owns the page, or how or where it
is mapped. A filesystem can impose copy-on-write; a block device cannot.

I gather from odd comments that I have seen that copy-on-write is rather
expensive. Marking a thousand contiguous pages copy-on-write is much
faster than copying one thousand pages. Making a single page
copy-on-write may not be much faster than copying the page. However, I'm
not 100% certain of these details.

Maybe if the filesystem could set a flag in the bio saying "this page
will not change until the write completes", then md could optimise that
case and do copies in other cases...

NeilBrown
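
The flag idea above can be sketched roughly as follows. This is a userspace model in C, not kernel code: the flag SKETCH_REQ_STABLE, the structs, and every function name are hypothetical and exist only for illustration. No such bio flag is claimed to exist. The sketch assumes a two-way mirror and models "the owner changes the page while the write is in flight" with a callback that runs between the two mirror writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096
#define SKETCH_NMIRRORS  2

/* HYPOTHETICAL flag: the submitter promises the page will not change
 * until the write completes. Not a real bio flag. */
#define SKETCH_REQ_STABLE 0x1u

/* Hypothetical stand-in for a single-page write request. */
struct sketch_write {
    unsigned int flags;
    const unsigned char *page;   /* buffer still owned by the submitter */
};

/* Hypothetical stand-in for one mirror leg: just a page of storage. */
struct sketch_mirror {
    unsigned char data[SKETCH_PAGE_SIZE];
};

/* Simulates the page owner modifying the buffer while the write is in flight. */
static void sketch_scribble(void *page)
{
    memset(page, 0x55, SKETCH_PAGE_SIZE);
}

/*
 * Write one page to every mirror. If the submitter did not promise
 * stability, take a private copy first so all mirrors receive identical
 * data. 'between' (may be NULL) is called after each mirror write except
 * the last, to model concurrent modification by the owner.
 * Returns true if a private copy was made.
 */
static bool sketch_mirror_write(struct sketch_mirror *mirrors,
                                const struct sketch_write *w,
                                void (*between)(void *), void *ctx)
{
    const unsigned char *src = w->page;
    unsigned char *bounce = NULL;
    bool copied = false;

    assert(w->page != NULL);

    if (!(w->flags & SKETCH_REQ_STABLE)) {
        /* No promise from above: pay for the copy. */
        bounce = malloc(SKETCH_PAGE_SIZE);
        if (bounce == NULL)
            abort();
        memcpy(bounce, w->page, SKETCH_PAGE_SIZE);
        src = bounce;
        copied = true;
    }

    for (int i = 0; i < SKETCH_NMIRRORS; i++) {
        memcpy(mirrors[i].data, src, SKETCH_PAGE_SIZE);
        if (between != NULL && i + 1 < SKETCH_NMIRRORS)
            between(ctx);
    }

    free(bounce);
    return copied;
}
```

Calling sketch_mirror_write() with flags set to 0 and sketch_scribble as the callback leaves both mirrors identical, because they are written from the private copy. Calling it with SKETCH_REQ_STABLE set skips the copy, which is only safe if the submitter keeps its promise; if the page is scribbled on anyway, the two mirrors end up with different contents.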