Re: BUG: bcache failing on top of degraded RAID-6

On 5/13/19 5:36 PM, Coly Li wrote:
> On 2019/5/9 3:43 上午, Coly Li wrote:
>> On 2019/5/8 11:58 下午, Thorsten Knabe wrote:
> [snipped]
> 
>>> Hi Coly.
>>>
>>>> I cannot do this, because this is real I/O issued to the backing
>>>> device; if it fails, it means something is really wrong with the
>>>> backing device.
>>>
>>> I have not found a definitive answer or documentation on what the
>>> REQ_RAHEAD flag is actually used for. However, in my understanding,
>>> after reading a lot of kernel source, it is used as an indication
>>> that the bio read request is unimportant for proper operation and may
>>> be failed by the block device driver with BLK_STS_IOERR if serving it
>>> would be too expensive or require too many additional resources.
>>>
>>> At least the BTRFS and DRBD code do not take I/O errors of bios
>>> marked with the REQ_RAHEAD flag into account in their error counters.
>>> Thus it is probably okay if such I/O errors on bios with the
>>> REQ_RAHEAD flag set are not counted as errors by bcache either.
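
(To make the idea concrete, this is roughly the kind of check I have in
mind. It is only an illustrative sketch with made-up names, not the
actual BTRFS, DRBD or bcache code:)

	/* Sketch: only account for I/O errors that matter; failures of
	 * read-ahead bios are ignored because the data was optional. */
	static void demo_count_io_error(struct demo_dev *d, struct bio *bio)
	{
		if (bio->bi_status == BLK_STS_OK)
			return;

		if (bio->bi_opf & REQ_RAHEAD)
			return;			/* failed read-ahead: harmless */

		atomic_inc(&d->io_errors);	/* made-up error counter */
	}
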
>>>
>>>>
>>>> Hmm, if raid6 returned a different error code in bio->bi_status, then
>>>> we could identify this as a failure caused by RAID degradation, not a
>>>> real hardware or link failure. But I am not familiar with the raid456
>>>> code and have no idea how to change the md raid code (I assume you
>>>> meant md raid6)...
>>>
>>> If my assumptions above regarding the REQ_RAHEAD flag are correct,
>>> then the RAID code is correct, because restoring data from the parity
>>> information is a relatively expensive operation for read-ahead data
>>> that is possibly never actually needed.
>>
>>
>> Hi Thorsten,
>>
>> Thank you for the informative hint. I agree with your idea; it seems
>> that ignoring I/O errors of REQ_RAHEAD bios does not hurt. Let me
>> think about how to fix it along the lines of your suggestion.
>>
> 
> Hi Thorsten,
> 
> Could you please test the attached patch?
> Thanks in advance.
> 

Hi Coly.

I applied your patch to three systems running Linux 5.1.1 yesterday
evening; on one of them I removed a disk from the RAID6 array.

The patch works as expected. The system with the removed disk has logged
more than 1300 of the messages added by your patch. Most of them were
logged shortly after boot-up, with a few shorter bursts evenly spread
over the runtime of the system.

It would probably be a good idea to apply some sort of rate limiting to
the log message, as a different file system or I/O pattern could easily
cause a lot more of these messages.
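
Something along these lines should be enough (an untested sketch; I am
guessing at the exact spot in your patch, and the message text is made
up):

	/* untested sketch: ignore the error and rate-limit the message so
	 * a pathological I/O pattern cannot flood the kernel log */
	if (bio->bi_opf & REQ_RAHEAD) {
		pr_warn_ratelimited("bcache: ignoring I/O error on read-ahead bio\n");
		return;
	}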

While we are at it: the WARN_ONCE at the beginning of the function looks
a bit suspicious too. It logs a warning when dc is NULL, but the
function continues and dereferences dc later on -> Oops!
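
For illustration, the pattern I mean looks roughly like this (paraphrased
from memory, not the exact code):

	WARN_ONCE(!dc, "NULL pointer of struct cached_dev");

	/* execution continues even when dc is NULL ... */
	errors = atomic_read(&dc->io_errors);	/* <- NULL dereference */

Returning early right after the WARN_ONCE, or dropping the check if dc
can never be NULL at that point, would avoid the oops.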

Regards
Thorsten


-- 
___
 |        | /                 E-Mail: linux@xxxxxxxxxxxxxxxxx
 |horsten |/\nabe                WWW: http://linux.thorsten-knabe.de


