Re: bcache writeback infinite loop?

On 2019/10/24 10:40 AM, Larkin Lowrey wrote:
> I have a backing device that is constantly writing due to
> bcache_writebac. It has been at 14.3 MB dirty all day and has not
> changed. There's nothing else writing to it.
> 
> This started after I "upgraded" from Fedora 29 to 30 and consequently
> from kernel 5.2.18 to 5.3.6.
> 
> You can see from the info below that the writeback process is chewing
> up A LOT of CPU and writing constantly at ~7MB/s. It sure looks like
> it's in an infinite loop and writing the same data over and over. At
> least I hope that's the case and it's not just filling the array with
> garbage.
> 
> This configuration has been stable for many years and across many Fedora
> upgrades. The host has ECC memory so RAM corruption should not be a
> concern. I have not had any recent controller or drive failures.
> 
>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>  5915 root      20   0       0      0      0 R  94.1   0.0 891:45.42 bcache_writebac
> 
> 
> Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
> md3              0.02 1478.77      0.00      6.90     0.00     0.00   0.00   0.00    0.00    0.00   0.00     7.72     4.78   0.00   0.00
> md3              0.00 1600.00      0.00      7.48     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     4.79   0.00   0.00
> md3              0.00 1300.00      0.00      6.07     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     4.78   0.00   0.00
> md3              0.00 1500.00      0.00      7.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     4.78   0.00   0.00
> 
> --- bcache ---
> UUID                        dc2877bc-d1b3-43fa-9f15-cad018e73bf6
> Block Size                  512 B
> Bucket Size                 512.00 KiB
> Congested?                  False
> Read Congestion             2.0ms
> Write Congestion            20.0ms
> Total Cache Size            128 GiB
> Total Cache Used            128 GiB     (100%)
> Total Cache Unused          0 B (0%)
> Evictable Cache             127 GiB     (99%)
> Replacement Policy          [lru] fifo random
> Cache Mode                  writethrough [writeback] writearound none
> Total Hits                  49872       (97%)
> Total Misses                1291
> Total Bypass Hits           659 (77%)
> Total Bypass Misses         189
> Total Bypassed              5.8 MiB
> --- Cache Device ---
>   Device File               /dev/dm-3 (253:3)
>   Size                      128 GiB
>   Block Size                512 B
>   Bucket Size               512.00 KiB
>   Replacement Policy        [lru] fifo random
>   Discard?                  False
>   I/O Errors                0
>   Metadata Written          5.0 MiB
>   Data Written              86.1 MiB
>   Buckets                   262144
>   Cache Used                128 GiB     (100%)
>   Cache Unused              0 B (0%)
> --- Backing Device ---
>   Device File               /dev/md3 (9:3)
>   bcache Device File        /dev/bcache0 (252:0)
>   Size                      73 TiB
>   Cache Mode                writethrough [writeback] writearound none
>   Readahead                 0.0k
>   Sequential Cutoff         4.0 MiB
>   Merge sequential?         False
>   State                     dirty
>   Writeback?                True
>   Dirty Data                14.3 MiB
>   Total Hits                49872       (97%)
>   Total Misses              1291
>   Total Bypass Hits         659 (77%)
>   Total Bypass Misses       189
>   Total Bypassed            5.8 MiB
> 
> I have not tried reverting back to an earlier kernel. I'm concerned
> about possible corruption. Is that safe? Any other suggestions as to how
> to debug and/or resolve this issue?

From 5.2 to 5.3, there are only a few changes related to writeback, and I
don't see an obviously suspicious location for an infinite writeback loop.

Since many issues were fixed in 5.3, it is possible that another problem
has surfaced because an earlier problem was fixed.

Do you see anything suspicious in the kernel message log? Or could you
tar up and compress the /sys/fs/bcache/<cache-set-uuid> and
/sys/block/bcache0/ directories and email them to me?
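
In case it helps, here is a rough, untested sketch of one way to collect
those directories. sysfs attribute files report a nominal size rather than
their real length, which can confuse tar, so this copies each readable
attribute into a staging tree with cat and archives that instead. The
function name snapshot_sysfs and the output file name are only
illustrative, not part of any bcache tool.

```shell
#!/bin/sh
# snapshot_sysfs OUTFILE DIR...
# Hypothetical helper: copy every readable regular file under each DIR
# into a temporary staging tree, then archive the tree as a gzip'd tarball.
snapshot_sysfs() {
    out=$1; shift
    stage=$(mktemp -d) || return 1
    for dir in "$@"; do
        # -H resolves a symlink given as an argument (e.g. /sys/block/bcache0)
        # but does not follow symlinks found inside the tree.
        find -H "$dir" -type f -perm -u=r 2>/dev/null |
        while IFS= read -r f; do
            mkdir -p "$stage/$(dirname "$f")"
            # Some attributes may fail on read; skip errors and carry on.
            cat "$f" > "$stage/$f" 2>/dev/null
        done
    done
    tar -czf "$out" -C "$stage" .
    rc=$?
    rm -rf "$stage"
    return $rc
}

# Usage (as root), substituting the real cache set UUID:
#   snapshot_sysfs bcache-sysfs.tar.gz \
#       /sys/fs/bcache/<cache-set-uuid> /sys/block/bcache0/
#   dmesg | grep -i bcache > bcache-dmesg.txt
```

Attributes that cannot be read end up as empty files in the archive, which
is fine for this purpose.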

And if you could run perf on the writeback thread to sample the hot
location in the infinite loop, I may be able to find a clue, if lucky.
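
For the perf sampling, something along these lines should be enough; it
is a minimal sketch, assuming perf is installed and run as root. The
wrapper name sample_thread, the 30 second duration and the perf.data
output name are arbitrary choices for illustration. The PID is the one
shown for bcache_writebac in your top output and will differ after a
reboot.

```shell
#!/bin/sh
# sample_thread PID [SECONDS] [OUTFILE]
# Hypothetical wrapper: record call-graph samples from one task for a
# fixed time using perf record.
sample_thread() {
    pid=$1
    secs=${2:-30}
    out=${3:-perf.data}
    # Reject an empty or non-numeric PID.
    case $pid in
        ''|*[!0-9]*)
            echo "usage: sample_thread PID [SECONDS] [OUTFILE]" >&2
            return 2
            ;;
    esac
    # Bail out early if the task no longer exists.
    [ -d "/proc/$pid" ] || { echo "no such process: $pid" >&2; return 1; }
    # -g records call graphs; "sleep" only bounds the sampling time.
    perf record -g -p "$pid" -o "$out" -- sleep "$secs"
}

# Usage (as root):
#   sample_thread 5915 30
#   perf report -i perf.data --stdio | head -n 50
```

The top of the perf report output is the part that would be most useful
to see.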

Thanks.

-- 

Coly Li


