On Wed, Apr 21, 2021 at 10:38 AM Paul Clements <paul.clements@xxxxxxxxxxx> wrote:
>
> On Tue, Apr 20, 2021, 7:49 PM Song Liu <song@xxxxxxxxxx> wrote:
> > On Tue, Apr 20, 2021 at 3:05 PM Paul Clements <paul.clements@xxxxxxxxxxx> wrote:
> > >
> > > This patch addresses a data corruption bug in raid1 arrays using bitmaps.
> > > Without this fix, the bitmap bits for the failed I/O end up being cleared.
> >
> > I think this only happens when we re-add a faulty drive?
>
> Yes, the bitmap gets cleared when the disk is marked faulty or a write
> error occurs. Then when the disk is re-added, the bitmap-based resync
> is, of course, not accurate.
>
> Is there another way to deal with a transient, transport-based error,
> other than this?
>
> For instance, I'm using nbd as one of the mirror legs. In that case,
> assuming the failures that lead to the device being marked faulty are
> just transport/network issues, then we want the resync to be able to
> correctly deal with this. It has always worked this way since a long
> time ago. There was a fairly recent commit
> (eeba6809d8d58908b5ed1b5ceb5fcb09a98a7cad) that re-arranged the code
> (previously all write failures were retried via flagging with
> R1BIO_WriteError).

So I guess we need "Fixes eeba6809d8d589"? CC Yufen, who authored the
above patch.

>
> Does the patch present a problem in some other scenario?

I don't think this presents any problem. Applied to md-next. (so no need
to resend for the Fix tag).

Thanks,
Song