Re: BUG: bcache failing on top of degraded RAID-6

Thorsten Knabe <linux@xxxxxxxxxxxxxxxxx> · Wed, 27 Mar 2019 12:00:42 +0100

On 3/27/19 10:44 AM, Coly Li wrote:
> On 2019/3/26 9:21 PM, Thorsten Knabe wrote:
>> Hello,
>>
>> there seems to be a serious problem when running bcache on top of a
>> degraded RAID-6 (MD) array. The bcache device /dev/bcache0 disappears
>> after a few I/O operations on the affected device and the kernel log
>> gets filled with the following log message:
>>
>> bcache: bch_count_backing_io_errors() md0: IO error on backing device,
>> unrecoverable
>>
> 
> It seems an I/O request to the backing device failed. If the md raid6
> device is the backing device, does it go into read-only mode after it
> is degraded?

No, the RAID6 backing device is still in read-write mode after the disk
has been removed from the RAID array. That's the way RAID6 is supposed
to work.
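
For anyone who wants to check this on their own setup, here is a minimal
sketch that reads the relevant state from sysfs. It assumes the standard
block-layer "ro" attribute and the MD attributes "md/array_state" and
"md/degraded" are present; the function name md_status and the sysfs
parameter are only made up for this illustration.

```python
from pathlib import Path


def md_status(dev="md0", sysfs="/sys/block"):
    """Read the read-only flag and MD state of an array from sysfs.

    Illustrative helper, not part of any tool. The sysfs root is a
    parameter so the function can be pointed at a test directory.
    """
    base = Path(sysfs) / dev

    def read(attr):
        return (base / attr).read_text().strip()

    return {
        # "1" in /sys/block/<dev>/ro means the block device is read-only
        "read_only": read("ro") == "1",
        # e.g. "clean" or "active" for a running array
        "array_state": read("md/array_state"),
        # number of devices by which the array is degraded
        "degraded": int(read("md/degraded")),
    }
```

Calling md_status("md0") returns a dict, so the read_only and degraded
values can be compared before and after a drive is removed.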

> 
> 
>> Setup:
>> Linux kernel: 5.1-rc2, 5.0.4, 4.19.0-0.bpo.2-amd64 (Debian backports)
>> all affected
>> bcache backing device: EXT4 filesystem -> /dev/bcache0 -> /dev/md0 ->
>> /dev/sd[bcde]1
>> bcache cache device: /dev/sdf1
>> cache mode: writethrough, none and cache device detached are all
>> affected; writeback and writearound have not been tested
>> KVM for testing, first observed on real hardware (failing RAID device)
>>
>> As long as the RAID6 is healthy, bcache works as expected. Once the
>> RAID6 gets degraded, for example by removing a drive from the array
>> (mdadm --fail /dev/md0 /dev/sde1, mdadm --remove /dev/md0 /dev/sde1),
>> the above-mentioned log messages appear in the kernel log and the bcache
>> device /dev/bcache0 disappears shortly afterwards logging:
>>
>> bcache: bch_cached_dev_error() stop bcache0: too many IO errors on
>> backing device md0
>>
>> to the kernel log.
>>
>> After increasing /sys/block/bcache0/bcache/io_error_limit to a very high
>> value (1073741824), the bcache device /dev/bcache0 remains usable without
>> any noticeable filesystem corruption.
> 
> If the backing device goes into read-only mode, bcache treats the
> backing device as failed. It then stops the bcache device on top of
> the failed backing device to notify the upper layers that something
> went wrong.
> 
> In writethrough and writeback mode, bcache requires the backing device to
> be writable.

But the degraded (one disk of the array missing) RAID6 device is still
writable.

Also, after raising the io_error_limit of the bcache device to a very
high value (1073741824 in my tests), I can use the bcache device on the
degraded RAID6 array for hours, reading and writing gigabytes of data,
without getting any I/O errors or observing any filesystem corruption.
I'm just getting a lot of these

bcache: bch_count_backing_io_errors() md0: IO error on backing device,
unrecoverable

messages in the kernel log.
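
The workaround itself is just a write to the sysfs attribute mentioned
above. A minimal sketch, assuming it runs as root and that the attribute
is readable as well as writable; the helper name set_io_error_limit and
its sysfs parameter are only for illustration:

```python
from pathlib import Path


def set_io_error_limit(limit, dev="bcache0", sysfs="/sys/block"):
    """Write a new io_error_limit and return the previous value.

    Illustrative helper only. The attribute path matches
    /sys/block/bcache0/bcache/io_error_limit from this report.
    """
    attr = Path(sysfs) / dev / "bcache" / "io_error_limit"
    previous = int(attr.read_text().strip())
    attr.write_text(f"{limit}\n")
    return previous
```

With set_io_error_limit(1073741824) the limit is raised to the value used
in my tests, and the returned number can be used to restore the old limit
later. This only hides the symptom; it does not stop the errors from
being counted.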

It seems that I/O requests for data that has been successfully
recovered by the RAID6 from the redundant information stored on the
remaining disks are mistakenly counted as failed I/O requests, and
once the configured io_error_limit for the bcache device is reached,
the bcache device gets stopped.
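
To make the suspected behaviour concrete, here is a toy model of it in
Python. This is not the bcache code, only a sketch of my hypothesis: each
request that gets flagged as failed bumps a counter, and the device is
stopped once the counter reaches io_error_limit. The class name, the
method names and the limit of 4 in the usage note are all invented for
the illustration.

```python
class BackingErrorCounter:
    """Toy model of an error counter with a stop threshold."""

    def __init__(self, io_error_limit):
        self.io_error_limit = io_error_limit
        self.io_errors = 0
        self.stopped = False

    def complete(self, flagged_as_failed):
        """Account for one finished request; return True while usable."""
        if self.stopped:
            return False
        if flagged_as_failed:
            self.io_errors += 1
            # Assumption: the device stops when the count reaches the limit.
            if self.io_errors >= self.io_error_limit:
                self.stopped = True
        return not self.stopped


def requests_survived(io_error_limit, flags):
    """Count how many requests complete before the device is stopped."""
    dev = BackingErrorCounter(io_error_limit)
    survived = 0
    for flagged in flags:
        if not dev.complete(flagged):
            break
        survived += 1
    return survived
```

In this model, if every request is wrongly flagged, a limit of 4 lets only
3 requests through before the device stops, while a limit of 1073741824
keeps it running for any realistic test, which matches what I observe.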

> 
> Thanks.
> 

Regards
Thorsten

-- 
___
 |        | /                 E-Mail: linux@xxxxxxxxxxxxxxxxx
 |horsten |/\nabe                WWW: http://linux.thorsten-knabe.de