I/O error on cache device can cause user observable errors

The bcache documentation says that errors on the cache device are
handled transparently.

I'm seeing a case where the cache device is unregistered in response
to repeated write errors (expected), but that results in a read error
on the bcache device (unexpected).

Here's how I'm reproducing the problem:
1. Create a device-mapper device with an error target to simulate I/O
errors. The device is 1G in size and fails all I/O in a 4M extent
starting at offset 128M:
    $ dmsetup create cache_disk << EOF
      0      262144    linear /dev/sdb 0
      262144 8192      error
      270336 1826816   linear /dev/sdb 270336
    EOF

2. Set up bcache in writethrough mode. The backing device is 1000G in size:
    $ make-bcache --cache /dev/mapper/cache_disk --bdev /dev/sdc --wipe-bcache --bucket 256k
    $ echo writethrough > /sys/block/bcache0/bcache/cache_mode
    $ echo 0 > /sys/block/bcache0/bcache/cache/synchronous

    $ lsblk
    NAME         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
    ...
    sdb            8:16   0    10G  0 disk
    └─cache_disk 253:0    0     1G  0 dm
      └─bcache0  252:0    0  1000G  0 disk
    sdc            8:32   0  1000G  0 disk
    └─bcache0    252:0    0  1000G  0 disk

3. Start a random read workload on the bcache device (using fio):
    $ fio --name=basic --filename=/dev/bcache0 --size=1000G --rw=randread --blocksize=256k --blockalign=256k

4. After a while I see that the cache device gets unregistered.
However, the application output indicates it saw an I/O error on a
read request:
    fio: io_u error on file /dev/bcache0: Input/output error: read offset=592264298496, buflen=262144
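
For reference, the sector counts in the dm table from step 1 can be derived from the sizes quoted there. This is only a sketch and assumes 512-byte sectors, the unit device-mapper tables use; the variable names are illustrative.

```shell
#!/bin/sh
# Derive the dm table sector counts from the sizes quoted in step 1.
# Assumption: 512-byte sectors, the unit used in device-mapper tables.
sector=512
mib=$((1024 * 1024))

error_start=$((128 * mib / sector))      # sectors before the failing extent
error_len=$((4 * mib / sector))          # length of the failing 4M extent
total=$((1024 * mib / sector))           # 1G device expressed in sectors
tail_start=$((error_start + error_len))  # first sector after the failing extent
tail_len=$((total - tail_start))         # remaining sectors up to 1G

# Print the three table lines in the same order as in step 1.
printf '0 %d linear /dev/sdb 0\n' "$error_start"
printf '%d %d error\n' "$error_start" "$error_len"
printf '%d %d linear /dev/sdb %d\n' "$tail_start" "$tail_len" "$tail_start"
```

The printed values should agree with the table passed to dmsetup in step 1.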

I can see in the syslogs that bcache unregistered the cache. The logs
also show that there was an I/O error on the bcache device:
    Feb  1 19:47:23 armont-bcache-test kernel: [ 3327.176867] bcache: bch_count_io_errors() dm-0: IO error on writing data to cache.
    Feb  1 19:47:23 armont-bcache-test kernel: [ 3327.186494] bcache: bch_count_io_errors() dm-0: IO error on writing data to cache.
    Feb  1 19:47:23 armont-bcache-test kernel: [ 3327.195743] bcache: bch_count_io_errors() dm-0: IO error on writing data to cache.
    Feb  1 19:47:23 armont-bcache-test kernel: [ 3327.204869] bcache: bch_count_io_errors() dm-0: IO error on writing data to cache.
    Feb  1 19:47:23 armont-bcache-test kernel: [ 3327.234722] bcache: bch_count_io_errors() dm-0: IO error on writing data to cache.
    Feb  1 19:47:23 armont-bcache-test kernel: [ 3327.246102] bcache: bch_count_io_errors() dm-0: IO error on writing data to cache.
    Feb  1 19:47:23 armont-bcache-test kernel: [ 3327.274013] bcache: bch_count_io_errors() dm-0: IO error on writing data to cache.
    Feb  1 19:47:23 armont-bcache-test kernel: [ 3327.289128] bcache: bch_cache_set_error() error on 427201f5-5c86-4890-9866-f9860e518041: dm-0: too many IO errors writing data to cache
    Feb  1 19:47:23 armont-bcache-test kernel: [ 3327.289128] , disabling caching
    Feb  1 19:47:23 armont-bcache-test kernel: [ 3327.306212] bcache: conditional_stop_bcache_device() stop_when_cache_set_failed of bcache0 is "auto" and cache is clean, keep it alive.
    Feb  1 19:47:23 armont-bcache-test kernel: [ 3327.306543] Buffer I/O error on dev bcache0, logical block 144595776, async page read
    Feb  1 19:47:23 armont-bcache-test kernel: [ 3327.316119] bcache: cached_dev_detach_finish() Caching disabled for sdc
    Feb  1 19:47:23 armont-bcache-test kernel: [ 3327.316398] bcache: cache_set_free() Cache set 427201f5-5c86-4890-9866-f9860e518041 unregistered
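
As a sanity check, the offset in the fio error and the logical block in the "Buffer I/O error" line appear to refer to the same location. The sketch below assumes that the kernel message counts blocks of 4096 bytes; that block size is an assumption, not something stated in the log.

```shell
#!/bin/sh
# Cross-check the fio error offset against the logical block in the kernel log.
# Assumption: the "Buffer I/O error" message counts blocks of 4096 bytes.
fio_offset=592264298496    # byte offset from the fio error line
block_size=4096            # assumed block size for the kernel message
logged_block=144595776     # logical block from the "Buffer I/O error" line

computed_block=$((fio_offset / block_size))
echo "computed block: $computed_block"

if [ "$computed_block" -eq "$logged_block" ]; then
    echo "fio offset and logged block match"
else
    echo "fio offset and logged block differ"
fi
```

Under that assumption the failed fio read and the logged buffer I/O error are the same request, which would mean the error reached the application while the cache set was being torn down.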

The steps above reproduce the problem most of the time, but not
always. In a few of the attempts, the cache was unregistered without
resulting in observable I/O errors.
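
Since the outcome varies between attempts, it may help to record the relevant bcache settings before and after each run. The helper below is a hypothetical sketch: the function name is made up, and the attribute list (state, cache_mode, stop_when_cache_set_failed, cache/synchronous, cache/io_error_limit) is an assumption about what this kernel exposes in sysfs. Attributes that are missing are reported rather than treated as errors.

```shell
#!/bin/sh
# Print selected bcache sysfs attributes for one bcache device.
# $1 is the device's bcache sysfs directory; the default matches the setup above.
# The attribute names are assumptions about what this kernel exposes;
# missing ones are reported instead of causing a failure.
dump_bcache_attrs() {
    dir=${1:-/sys/block/bcache0/bcache}
    for attr in state cache_mode stop_when_cache_set_failed \
                cache/synchronous cache/io_error_limit; do
        if [ -r "$dir/$attr" ]; then
            printf '%s: %s\n' "$attr" "$(cat "$dir/$attr")"
        else
            printf '%s: (not available)\n' "$attr"
        fi
    done
}

dump_bcache_attrs "$@"
```

Once the cache set has been detached, the entries under cache/ would be expected to show up as not available.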

Is this expected?

I'm running Linux kernel version 6.5.0.




