On 28/10/16 12:52, Kent Overstreet wrote:
> That's not what the raid 5 hole is. The raid 5 hole comes from the fact that it's not possible to update the p/q blocks atomically with the data blocks, thus there is a point in time when they are _inconsistent_ with the rest of the stripe, and if used will lead to reconstructing incorrect data. There's no way to fix this with just flushes.
Yes, I understand this, but if the kernel strictly orders writing mdraid data blocks before parity ones, then it closes part of the hole, especially if I have a "journal" in a higher layer and, of course, ensure that this journal is reliable.
I think this will still fail if the drive that fails contains data blocks which have been written but whose parity blocks have not yet been.
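To make the window concrete, here is a toy sketch of a three-disk stripe with plain XOR parity. It is only an illustration under simplifying assumptions: the helper names (xor_blocks, reconstruct) are made up for this example and have nothing to do with the actual md code, and there is no q block. It models a crash after a data block has been written but before its parity has been updated, followed by the loss of one drive.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(stripe, parity, lost):
    """Rebuild the data block at index `lost` from the survivors plus parity."""
    survivors = [blk for i, blk in enumerate(stripe) if i != lost]
    return xor_blocks(survivors + [parity])


# A consistent stripe: two data blocks and their parity.
old = [b"AAAA", b"BBBB"]
parity = xor_blocks(old)

# Data block 0 is rewritten, then the machine crashes before the parity
# update reaches disk. The on-disk parity still describes the old stripe.
new = [b"CCCC", b"BBBB"]
stale_parity = parity

# Case 1: the drive holding the freshly written block fails.
# Reconstruction returns the old contents, so the write is silently lost.
rebuilt_written = reconstruct(new, stale_parity, lost=0)

# Case 2: the drive holding the untouched block fails.
# Reconstruction mixes new data with stale parity and yields garbage
# for a block that was never part of the write.
rebuilt_untouched = reconstruct(new, stale_parity, lost=1)

print(rebuilt_written)
print(rebuilt_untouched)
```

In this model, a reliable journal above the array could replay the lost write in case 1, but case 2 corrupts a block that the journal has no reason to know about, which is why ordering alone only closes part of the hole.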
I also think, however, that putting bcache /under/ mdraid, (again) ensuring that the bcache layer is reliable, and requiring bcache to "journal" all writes would provide an extremely reliable storage layer, even at a very large scale.
James