Re: strange problem with raid6 read errors on active non-degraded array

Ethan Wilson <ethan.wilson@xxxxxxxxxxxxx> · Wed, 02 Jul 2014 23:34:53 +0200

On 02/07/2014 20:28, Pedro Teixeira wrote:

Hi Ethan,

The thing here is that some of the bad blocks (if not all) that are 
giving read errors are not on the bad blocks list.

Are you sure? Please note that offsets are a tricky topic here: an 
offset reported by fsck is a sector offset in the md0 sense, while the 
device bad block list contains offsets in the device sense. To convert 
one into the other you have to divide, or multiply, by the number of 
data disks, approximately, and handle the remainder manually, also 
taking the rotating parity into account. Not simple. Is this the 
computation that you did?
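
As a rough illustration of that conversion, here is a minimal Python 
sketch. It is not taken from the MD code. It assumes the default 
left-symmetric RAID6 layout, with P placed on disk 
raid_disks - 1 - (stripe mod raid_disks), Q on the next disk, and the 
data chunks following Q. The function name and its parameters are 
hypothetical, and whether the bad block list is relative to the start of 
the device or to the data area is also an assumption (handled here with 
a data_offset parameter). Check the result against raid5_compute_sector 
in the kernel before relying on it.

```python
def array_sector_to_device(array_sector, raid_disks, chunk_sectors, data_offset=0):
    """Map a sector offset on the md array to (disk slot, device sector).

    Hypothetical helper. Assumes RAID6 with a left-symmetric layout.
    All offsets are in 512-byte sectors.
    """
    data_disks = raid_disks - 2  # RAID6 keeps two parity chunks per stripe

    # Split the array offset into a chunk number and an offset inside the chunk.
    chunk_number, chunk_offset = divmod(array_sector, chunk_sectors)

    # Divide by the number of data disks; the remainder picks the data chunk.
    stripe, data_index = divmod(chunk_number, data_disks)

    # Rotating parity (assumed placement, see the text above).
    p_disk = raid_disks - 1 - (stripe % raid_disks)
    q_disk = (p_disk + 1) % raid_disks
    disk = (p_disk + 2 + data_index) % raid_disks

    # Offset on the member device; data_offset is an assumption about
    # how the bad block list is expressed.
    device_sector = stripe * chunk_sectors + chunk_offset + data_offset
    return disk, device_sector, p_disk, q_disk


if __name__ == "__main__":
    # Example: 6-disk RAID6, 512 KiB chunks (1024 sectors).
    print(array_sector_to_device(4101, raid_disks=6, chunk_sectors=1024))
```

Note that fsck usually reports filesystem block numbers, so these must 
first be turned into 512-byte sectors relative to the start of md0 
before a conversion like this applies.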

Specifically, the ones that show up when doing a fsck are not on any 
drive. For these sectors fsck tries to re-write them and md still 
throws an error, but they are not added to the list.

Not "added" but "removed". Writing to a bad block should create valid 
content, so the block should be removed from the list. If it is not, 
then there is indeed probably a bug in the MD code; see my previous post.

I replaced sdm with a new disk. This was one that had a bunch of bad 
blocks reported by md, and after finishing the rebuild (with no 
errors at all) --examine-badblocks still gives me the exact same 
list of errors. I would expect that replacing the disk with a new one 
would clear the errors.

This is the correct behaviour by design.
The source disks did not have valid content in those positions, so good 
data cannot be created from nothing. The bad block entries are 
replicated onto the new disk.
"Bad" here is more a synonym of "containing invalid data", not really 
"unreadable surface".

As I know the disks are good, is there any way of resetting the bad 
blocks list without destroying the filesystem?

This one I don't know, but doing that would probably not help to find 
the bug.

Regards
EW
