On Mon, Jul 09, 2018 at 04:10:19PM +0200, Michael Niewöhner wrote:
> That would require a - at least in-memory - bitmap, too.

Sure, but it would remove the need to keep a persistent structure, with all the overhead and possibility of error that entails. I think there was a bug recently that caused data not to be re-synced in certain cases. Sometimes syncing everything is preferable to optimizing for speed and then making the wrong decision because the logic is complex and may miss some corner cases.

> LVM passes trim to the lower layers so no problem here.

It would be about querying free space on the fly. There's no standard way to do that. fstrim is not standard either; each filesystem decides how to go about it - some trim all free space, others only some of it, and some don't support it at all. And for LVM, partitions, etc. it's obviously different again, and other storage layers are possible... (first sketch in the P.S. below). Anyway, wild idea. A bitmap is certainly more in line with what RAID does.

As for the backing device idea, perhaps the thin provisioning target (device mapper / LVM) would work too. That's the only thing in the kernel I can think of that already keeps track of trimmed space. Not sure how much overhead that involves, but if you could build md on top of thin provisioning and then query device mapper for the used regions, that might work as well (second sketch below). Just throwing ideas around.

My personal setup is also different; I like to slice my drives into partitions of the same size, create separate RAIDs from them, and then merge those back together with LVM (each RAID is one PV). So I have several RAIDs like these:

md5 : active raid6 sdf5[8] sde5[7] sdh5[5] sdg5[9] sdd5[3] sdc5[2] sdb5[1]
      1220692480 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/7] [UUUUUUU]

md4 : active raid6 sdf4[8] sde4[7] sdh4[5] sdg4[9] sdd4[3] sdc4[2] sdb4[1]
      1220692480 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/7] [UUUUUUU]

md3 : active raid6 sdf3[8] sde3[7] sdh3[5] sdg3[9] sdd3[3] sdc3[2] sdb3[1]
      1220692480 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/7] [UUUUUUU]

That can be used to speed up disk replacements: provided there are entire PVs without data on them, you can just --assume-clean those segments. Or you can decide which PVs are most important and sync those first. Of course this is a manual process (last sketch below).

But that's super low resolution, as the number of partitions and RAIDs you can run is obviously limited, and each RAID instance comes with metadata offsets that eat into the usable space.

Regards
Andreas Klauer
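
P.S.: A few rough sketches to make the above less hand-wavy. First, the "no standard way to query free space" point - every layer has its own interface, if it has one at all. These are just the ones I can think of; device and mount point names are made up:

    # filesystem level: fstrim trims whatever the fs considers free,
    # how much that actually is depends entirely on the filesystem
    fstrim -v /mnt/data

    # filesystem level: free space as reported by statfs()
    df -B1 /mnt/data

    # LVM level: unallocated extents in a volume group
    vgs -o vg_name,vg_size,vg_free

    # partition level: "free" doesn't even exist inside a partition,
    # only unallocated space in the partition table
    parted /dev/sda unit s print free

None of these tell md anything, which is why I called it a wild idea.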
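
Second, the thin provisioning idea. A minimal sketch, assuming one thin pool per disk and a RAID1 on top; all names are made up and I have no idea what the overhead looks like in practice:

    # one PV / VG / thin pool per disk
    pvcreate /dev/sda1 /dev/sdb1
    vgcreate vg_a /dev/sda1
    vgcreate vg_b /dev/sdb1
    lvcreate --type thin-pool -l 100%FREE -n pool vg_a
    lvcreate --type thin-pool -l 100%FREE -n pool vg_b

    # thin volumes as RAID members
    lvcreate --type thin -V 1T --thinpool pool -n member vg_a
    lvcreate --type thin -V 1T --thinpool pool -n member vg_b
    mdadm --create /dev/md0 --level=1 --raid-devices=2 \
          /dev/vg_a/member /dev/vg_b/member

    # allocation per volume
    lvs -o lv_name,data_percent vg_a vg_b

    # device mapper view: the thin target reports mapped sectors
    dmsetup status vg_a-member

The per-region mappings md would actually need live in the pool metadata; you'd have to pull them out with thin_dump (thin-provisioning-tools) or teach md to query the dm-thin target directly, and that plumbing doesn't exist today.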
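
Last, the manual disk replacement trick with sliced PVs, roughly. Warning: recreating an array with --assume-clean is only harmless here because the PV carries no data, and the metadata version, data offset and device order all have to match the original creation (check with mdadm --examine first), otherwise you lose the PV label as well:

    # which PVs actually hold data?
    pvs -o pv_name,pv_used,pv_free

    # empty PV: skip the resync entirely by recreating the array
    # (device order below is illustrative - verify with mdadm --examine)
    mdadm --stop /dev/md5
    mdadm --create /dev/md5 --assume-clean --metadata=1.2 --level=6 \
          --chunk=512 --raid-devices=7 \
          /dev/sdb5 /dev/sdc5 /dev/sdd5 /dev/sdg5 /dev/sdh5 /dev/sde5 /dev/sdf5

    # PVs with data: replace the disk normally and let md resync;
    # arrays sharing the same disks recover one at a time anyway
    # (the waiting ones show resync=DELAYED in /proc/mdstat)
    mdadm /dev/md3 --add /dev/sdi3
    cat /proc/mdstat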