On Tue, Apr 20, 2021, 7:49 PM Song Liu <song@xxxxxxxxxx> wrote:
> On Tue, Apr 20, 2021 at 3:05 PM Paul Clements <paul.clements@xxxxxxxxxxx> wrote:
> >
> > This patch addresses a data corruption bug in raid1 arrays using bitmaps.
> > Without this fix, the bitmap bits for the failed I/O end up being cleared.
>
> I think this only happens when we re-add a faulty drive?

Yes, the bitmap gets cleared when the disk is marked faulty or a write
error occurs. Then when the disk is re-added, the bitmap-based resync
is, of course, not accurate.

Is there another way to deal with a transient, transport-based error,
other than this? For instance, I'm using nbd as one of the mirror legs.
In that case, assuming the failures that lead to the device being marked
faulty are just transport/network issues, we want the resync to be able
to deal with this correctly. It has worked this way for a long time.
There was a fairly recent commit
(eeba6809d8d58908b5ed1b5ceb5fcb09a98a7cad) that re-arranged the code
(previously all write failures were retried via flagging with
R1BIO_WriteError).

Does the patch present a problem in some other scenario?

Thanks,
Paul