Read error in superblock not handled well by MD

"Nik.Brt." <nik.brt@xxxxxxxxxxxxx> · Mon, 18 Apr 2022 14:21:18 +0200

Hello all,

Recently, on one of our servers, we had what appeared to be sporadic 
corruption during a write. It happened while the superblock and the 
bitmap were being written to one disk of an array.

This corruption resulted in two consecutive 4k sectors (superblock, 
bitmap) being unreadable on a disk which was otherwise good.
The array was a RAID1 with 2 disks. The disks are model WDC 
WD60EFRX-68MYMN1.
We only noticed the error thanks to SMART long self-tests, because 
MD/mdadm did not report anything.
Reading those sectors with dd confirmed the on-disk problem (read error).
As expected, mdadm --examine and --examine-bitmap could not read any 
valid data from there either.
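
The read check described above can be sketched roughly as follows. This
is only an illustration, not the procedure used in the report (which was
done with dd): probe_blocks is a hypothetical helper, the byte offsets
are placeholders because the report does not give them, and a plain
read like this may be served from the page cache instead of the disk.

```python
import errno
import os

BLOCK_SIZE = 4096  # the report mentions 4k sectors


def probe_blocks(path, offsets, block_size=BLOCK_SIZE):
    """Try to read one block at each byte offset; return {offset: status}."""
    results = {}
    fd = os.open(path, os.O_RDONLY)
    try:
        for offset in offsets:
            try:
                data = os.pread(fd, block_size, offset)
            except OSError as exc:
                # A failing sector typically surfaces as EIO.
                name = errno.errorcode.get(exc.errno, str(exc.errno))
                results[offset] = "read error: " + name
                continue
            if len(data) < block_size:
                results[offset] = "short read"
            elif not any(data):
                results[offset] = "all zeroes"
            else:
                results[offset] = "ok"
    finally:
        os.close(fd)
    return results
```

It would be called with the component device and the byte offsets of
the suspect sectors, whatever those are for the metadata layout in use.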

After this episode, MD didn't behave well IMHO.

During array checks the error was not reported and the superblock and 
the bitmap on that disk would never be rewritten; during event count 
changes the superblock on that disk was never rewritten (it was written 
on the other disk of the array), and during writes to the array, the 
bitmap of that disk was never rewritten (it was written on the other 
disk of the array).
The array stayed up otherwise, but had we restarted the server, it would 
have restarted with 1 disk only.

We waited several days to see if the problem would resolve on its own, 
but it did not.
Then we went in and used dd to overwrite those two 4k sectors with zeroes.
The disk was good, so this fixed the read error immediately, on the 
first attempt.
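
A minimal sketch of that overwrite, for illustration only. The report
used dd; zero_blocks is a hypothetical helper, and the offset must be
supplied by the caller because the report does not state where the two
sectors were. Writing zeroes to the wrong place on a component device
destroys data.

```python
import os

BLOCK_SIZE = 4096  # the report mentions 4k sectors


def zero_blocks(path, offset, count, block_size=BLOCK_SIZE):
    """Overwrite `count` blocks starting at byte `offset` with zeroes."""
    if offset % block_size != 0:
        raise ValueError("offset must be block-aligned")
    payload = bytes(block_size)  # block_size zero bytes
    fd = os.open(path, os.O_WRONLY)
    try:
        for i in range(count):
            written = os.pwrite(fd, payload, offset + i * block_size)
            if written != block_size:
                raise OSError("short write at block %d" % i)
        os.fsync(fd)  # push the data to the device before returning
    finally:
        os.close(fd)
    return count * block_size
```

For the case described here, count would be 2 (superblock and bitmap
sectors, which were consecutive).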

Within a very short time, less than 2 minutes, MD resumed writing 
those sectors, so we again had a good superblock and a good bitmap on 
the previously-bad disk.

So I suppose what MD does is this: before updating the superblock and/or 
the bitmap, MD tries to read those sectors. If it encounters a read 
error, it refrains from rewriting them. Reading zeroes (a clearly 
invalid value), however, is apparently fine.

I'm not sure why the algorithm works like this, but it prevents a disk 
surface problem / read error in the superblock and/or bitmap areas from 
ever being fixed, and those areas are not fixed even by check/repair 
actions on the array.

I propose that MD should write those sectors without attempting to read 
them first.
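
The supposed read-before-write logic and the proposed change can be
modelled roughly as follows. This is a toy sketch of the supposition in
this report, not MD's actual code; ToyDisk, update_read_first and
update_write_only are hypothetical names, and the model assumes that a
successful write makes a bad sector readable again, as the dd overwrite
did.

```python
class ToyDisk:
    """In-memory stand-in for a disk; not real MD or kernel code."""

    def __init__(self, nblocks):
        self.blocks = [b"\x00"] * nblocks
        self.unreadable = set()  # indices that fail on read

    def read(self, idx):
        if idx in self.unreadable:
            raise IOError("read error on block %d" % idx)
        return self.blocks[idx]

    def write(self, idx, data):
        # Assumption: a successful write makes the sector readable again.
        self.blocks[idx] = data
        self.unreadable.discard(idx)


def update_read_first(disk, idx, data):
    """Supposed current behaviour: skip the write if the read fails."""
    try:
        disk.read(idx)
    except IOError:
        return False  # the bad sector is left as it is
    disk.write(idx, data)
    return True


def update_write_only(disk, idx, data):
    """Proposed behaviour: write without reading first."""
    disk.write(idx, data)
    return True
```

In this model an unreadable block stays unreadable forever under
update_read_first until something else writes to it, while
update_write_only repairs it on the next metadata update.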

Thank you
N.Br. (prefer not to be acknowledged for this bug report or fix)