[LSF/MM/BPF TOPIC] selectively cramming things onto struct bio

"Darrick J. Wong" <darrick.wong@xxxxxxxxxx> · Thu, 30 Jan 2020 16:44:47 -0800

Hi everyone,

Several months ago, there was a discussion[1] about enhancing XFS to
take a more active role in recovering damaged blocks from a redundant
storage device when the block device doesn't signal an error but the
filesystem can tell that something is wrong.

Yes, we (XFS) would like to be able to exhaust all available storage
redundancy before we resort to rebuilding lost metadata, and we'd like
to do that without implementing our own RAID layer.

In the end, the largest stumbling block seems to be how to attach
additional instructions to struct bio.  Jens rejected the idea of adding
more pointers or more bytes to a struct bio since we'd be forcing
everyone to pay the extra memory price for a feature that in the ideal
situation will be used infrequently.

I think Martin Petersen tried to introduce separate bio pools so that we
only end up using larger bios for devices that really need it, but ran
into some difficulty with the usage model for how that would work.  (We
could, in theory, need to attach integrity data *and* retry attributes
to the same disk access).
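
To make the combination problem concrete, here is a rough userspace
sketch (not kernel code, and not anything that was actually posted).
Everything in it is hypothetical: struct fake_bio, the EXT_* flags, the
two extension structs and the helper names are invented for
illustration.  It assumes optional attributes are laid out after the
base structure in a fixed order, so a plain bio pays nothing and each
combination of attributes gets its own allocation size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bio; NOT the kernel's real layout. */
struct fake_bio {
	unsigned long sector;
	unsigned int size;
	unsigned int ext_flags;		/* which optional extensions follow */
};

/* Hypothetical extension flags, one per optional attribute. */
enum { EXT_INTEGRITY = 1u << 0, EXT_RETRY = 1u << 1 };

/* Hypothetical payloads: integrity buffer and replica selection. */
struct ext_integrity { void *prot_buf; unsigned int prot_len; };
struct ext_retry { unsigned int replica; };

/* Bytes needed for a bio carrying the given set of extensions. */
static size_t fake_bio_size(unsigned int ext_flags)
{
	size_t sz = sizeof(struct fake_bio);

	assert(!(ext_flags & ~(EXT_INTEGRITY | EXT_RETRY)));
	if (ext_flags & EXT_INTEGRITY)
		sz += sizeof(struct ext_integrity);
	if (ext_flags & EXT_RETRY)
		sz += sizeof(struct ext_retry);
	return sz;
}

/* Allocate a zeroed bio with room for exactly the requested extensions. */
static struct fake_bio *fake_bio_alloc(unsigned int ext_flags)
{
	struct fake_bio *bio = calloc(1, fake_bio_size(ext_flags));

	if (bio)
		bio->ext_flags = ext_flags;
	return bio;
}

/* Integrity data, if present, always sits right after the base struct. */
static struct ext_integrity *fake_bio_integrity(struct fake_bio *bio)
{
	if (!(bio->ext_flags & EXT_INTEGRITY))
		return NULL;
	return (struct ext_integrity *)(bio + 1);
}

/* Retry attributes come after the integrity data when both are present. */
static struct ext_retry *fake_bio_retry(struct fake_bio *bio)
{
	char *p = (char *)(bio + 1);

	if (!(bio->ext_flags & EXT_RETRY))
		return NULL;
	if (bio->ext_flags & EXT_INTEGRITY)
		p += sizeof(struct ext_integrity);
	return (struct ext_retry *)p;
}
```

With two optional attributes there are already four possible sizes, so
a one-pool-per-size scheme would need up to four pools per device, and
the count doubles with every attribute added after that.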

So I propose a discussion of what exactly are the combinations of bio
attributes that are needed by block layer callers.  IIRC, the DIF/DIX
support code needs to be able to attach the integrity data on its own;
whereas XFS already knows which device and which replica it would like
to try.  If the storage isn't total crap it shouldn't need to use the
feature all that often.

While we're on the topic of replica selection and discovery, let's also
bikeshed how to figure out how many replicas are even available.

(Yes, yes, the crazydragon rears his head again...;)

--D

[1] https://lore.kernel.org/linux-block/1543376991-5764-1-git-send-email-allison.henderson@xxxxxxxxxx/