Neil Brown <neilb@xxxxxxx> writes:

> (*) I've been wondering about adding another bitmap which would record
> which sections of the array have valid data. Initially nothing would
> be valid and so wouldn't need recovery. Every time we write to a new
> section we add that section to the 'valid' sections and make sure that
> section is in-sync.
> When a device was replaced, we would only need to recover the parts of
> the array that are known to be invalid.
> As filesystem start using the new "invalidate" command for block
> devices, we could clear bits for sections that the filesystem says are
> not needed any more...
> But currently it is just a vague idea.
>
> NeilBrown

If you are up for experimenting, I would go for a completely new
approach. Instead of working with physical blocks and marking where
blocks are used and out of sync, how about adding a mapping layer on
the device and using virtual blocks? You reduce the reported disk size
by maybe 1% so that there are always some spare blocks, and initially
all blocks are unmapped (unused). Then, whenever there is a write, you
pick an unused block, write to it, and change the in-memory mapping
of the logical to the physical block.

Every X seconds, on a barrier, or on a sync, you commit the mapping from
memory to disk in such a way that it is synchronized between all disks
in the raid. So every committed mapping represents a valid raid set.
After a mapping has been committed, all blocks that changed between it
and the previous committed mapping can be marked as free again. Better
to use the second-to-last mapping instead, so that there are always 2
valid mappings to choose from after a crash.

This would obviously need a lot more space than a bitmap, but space is
(relatively) cheap. One benefit, imho, should be that a sync/barrier
would not have to stop all activity on the raid while waiting for the
sync/barrier to finish. It just has to finalize the mapping for the
commit and can then start a new in-memory mapping while the finalized
one is written to disk.
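To make the bookkeeping a bit more concrete, here is a rough sketch in Python of how such a mapping layer might track blocks on a single device. Everything in it is an assumption made for illustration: the class name `RemappedDevice`, its methods, the integer `spare_percent` parameter, and the use of in-memory lists in place of real disk I/O are all hypothetical, and it leaves out the hard part, which is committing the mapping consistently across all disks in the raid.

```python
class RemappedDevice:
    """Toy model of one device with a logical-to-physical block map.

    Hypothetical illustration only: no real I/O, no on-disk format,
    no synchronization between raid members.
    """

    def __init__(self, physical_blocks, spare_percent=1):
        # Reserve roughly 1% of the blocks as spares (assumed policy).
        spare = max(1, physical_blocks * spare_percent // 100)
        self.logical_blocks = physical_blocks - spare  # size reported upward
        self.storage = [None] * physical_blocks        # stands in for the disk
        self.free = list(range(physical_blocks))       # all blocks start unmapped
        self.mapping = {}          # in-memory map: logical -> physical
        self.committed = [{}, {}]  # two newest committed maps, oldest first
        self.fresh = set()         # blocks allocated since the last commit
        self.retired_cur = []      # blocks superseded since the last commit
        self.retired_prev = []     # blocks superseded one generation earlier

    def write(self, lba, data):
        if not 0 <= lba < self.logical_blocks:
            raise IndexError("logical block out of range")
        if not self.free:
            raise RuntimeError("no free blocks left until older ones are reclaimed")
        phys = self.free.pop()          # pick an unused physical block
        self.storage[phys] = data
        old = self.mapping.get(lba)
        self.mapping[lba] = phys        # remap in memory only
        self.fresh.add(phys)
        if old is None:
            return
        if old in self.fresh:
            # Never part of a committed map, so it can be reused at once.
            self.fresh.discard(old)
            self.free.append(old)
        else:
            # Still referenced by a committed map; keep it for now.
            self.retired_cur.append(old)

    def read(self, lba):
        phys = self.mapping.get(lba)
        return None if phys is None else self.storage[phys]

    def commit(self):
        # Blocks retired one generation ago are referenced by neither of
        # the two maps that remain valid after this commit: free them.
        self.free.extend(self.retired_prev)
        self.retired_prev = self.retired_cur
        self.retired_cur = []
        self.fresh.clear()
        # Snapshot the map; later writes only touch the in-memory copy.
        self.committed = [self.committed[1], dict(self.mapping)]

    def crash_recover(self):
        # Drop everything written since the last commit and fall back
        # to the newest committed map.
        self.free.extend(self.fresh)
        self.fresh.clear()
        self.retired_cur.clear()
        self.mapping = dict(self.committed[1])
```

In this sketch a superseded block is released only at the second commit after it was replaced, which is what keeps two committed mappings usable at all times. Because `commit()` works on a snapshot of the map, a real implementation could in principle keep accepting writes into a new in-memory mapping while the snapshot goes out to disk.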
Just some thoughts,
        Goswin