On Sun, 2018-07-08 at 19:23 +0100, Wols Lists wrote:
> The problem with this is that (a) MD has no idea which bits of the disk
> are in use and which are not, and (b) just copying the bits that are in
> use will leave the unused bits in an unsync'd state, which will then
> moan blue murder on a check.
>
> It would make sense to prioritise that used stuff, but you do need to
> copy everything. If, however, we could get TRIM to actually zero the
> disk, that might well speed things up. It will need a lot of thinking
> through, though. It's a lot trickier than it sounds.

TRIM does not guarantee that the block is physically zeroed on disk. We
would not even want that, because it would waste time. Some disks
support DRAT/RZAT and return deterministic data (zeros, with RZAT) for
previously trimmed blocks, but not all disks do. Some disks even return
random data for trimmed blocks.

I think both problems can be solved by having the upper layer keep track
of used blocks in a bitmap-like structure in the metadata. That bitmap
needs to be redundant, just like the data. I don't know from memory
whether the metadata is redundant or per-disk.

The write-intent bitmap does something similar, but it tracks regions
that may be out of sync after a write rather than blocks that are in
use. We need a used-block bitmap that is stored on the disk(s).

When one of the disks gets replaced, the bitmap has to be synced first,
and then the RAID data is synced based on that bitmap. An array check
would simply ignore unused, out-of-sync blocks.

Every read and write to such a bitmap-based RAID array would need to
check or alter the bitmap. One problem I see is that every write will
mean two writes: bitmap and data. Maybe the bitmap could be held in
memory and synced to disk periodically, e.g. every 5 seconds?

Other ideas welcome.
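
To make the idea a bit more concrete, here is a rough Python sketch of
such a used-block bitmap. It is only an illustration, not MD code: the
class UsedBlockBitmap, the helpers resync() and check(), and the
on_disk attribute (a stand-in for the persisted copy) are all made-up
names, and disks are modelled as plain Python lists of blocks.

```python
import time


class UsedBlockBitmap:
    """Hypothetical in-memory used-block bitmap with periodic flush."""

    def __init__(self, num_blocks, flush_interval=5.0, clock=time.monotonic):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)  # one bit per block
        self.flush_interval = flush_interval          # seconds between flushes
        self.clock = clock
        self.dirty = False
        self.last_flush = clock()
        self.on_disk = bytes(self.bits)  # stand-in for the persisted copy

    def is_used(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def mark_used(self, block):
        """Called on a write to a block that was not in use before."""
        if not self.is_used(block):
            self.bits[block // 8] |= 1 << (block % 8)
            self.dirty = True

    def mark_unused(self, block):
        """Called when the upper layer discards (TRIMs) a block."""
        if self.is_used(block):
            self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF
            self.dirty = True

    def maybe_flush(self):
        """Persist the bitmap if it is dirty and the interval has elapsed."""
        if self.dirty and self.clock() - self.last_flush >= self.flush_interval:
            self.on_disk = bytes(self.bits)
            self.dirty = False
            self.last_flush = self.clock()
            return True
        return False

    def used_blocks(self):
        return [b for b in range(self.num_blocks) if self.is_used(b)]


def resync(bitmap, source, target):
    """Rebuild a replaced disk by copying only the used blocks."""
    for b in bitmap.used_blocks():
        target[b] = source[b]


def check(bitmap, disk_a, disk_b):
    """Return mismatching used blocks; unused blocks are ignored."""
    return [b for b in bitmap.used_blocks() if disk_a[b] != disk_b[b]]
```

One caveat with the lazy flush, as far as I can tell: delaying the
clearing of a bit is harmless (a rebuild just copies a block it did not
need to), but if a newly set bit is lost in a crash, a later rebuild
would skip a block that holds live data. So setting a bit probably has
to reach the disk before the data write, and only clearing can be
batched.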