Hi everyone,

Several months ago, there was a discussion[1] about enhancing XFS to take a more active role in recovering damaged blocks from a redundant storage device when the block device doesn't signal an error but the filesystem can tell that something is wrong.

Yes, we (XFS) would like to be able to exhaust all available storage redundancy before we resort to rebuilding lost metadata, and we'd like to do that without implementing our own RAID layer.

In the end, the largest stumbling block seems to be how to attach additional instructions to struct bio. Jens rejected the idea of adding more pointers or more bytes to a struct bio since we'd be forcing everyone to pay the extra memory price for a feature that in the ideal situation will be used infrequently. I think Martin Petersen tried to introduce separate bio pools so that we only end up using larger bios for devices that really need them, but ran into some difficulty with the usage model for how that would work. (We could, in theory, need to attach integrity data *and* retry attributes to the same disk access.)

So I propose a discussion of exactly which combinations of bio attributes block layer callers need. IIRC, the DIF/DIX support code needs to be able to attach the integrity data on its own, whereas XFS already knows which device and which replica it would like to try. If the storage isn't total crap, XFS shouldn't need to use the feature all that often.

While we're on the topic of replica selection and discovery, let's also bikeshed how to figure out how many replicas are even available.

(Yes, yes, the crazydragon rears his head again... ;)

--D

[1] https://lore.kernel.org/linux-block/1543376991-5764-1-git-send-email-allison.henderson@xxxxxxxxxx/