Hi! When looking at htop, I see hundreds of kernel threads dedicated to bcache workers. Many of them go in and out of D state at the same time which may explain the spikes in load I'm seeing especially with writeback enabled (easily surpassing 200 loadavg). >From some older observations, the system feels smoother when turning bcache off (cache_mode = none) but obviously many file system operations are much slower. With bcache enabled, I'm seeing spontaneous desktop lags every once in a while. My bcache setup runs mainly on btrfs, with one additional xfs partition where I've put some background database workload which isn't exactly friends with btrfs. I'm wondering if this is expected. No other block-related driver seems to be such a heavy user of kernel threads: ``` # uname -a Linux jupiter 5.10.9-gentoo #3 SMP PREEMPT Sun Jan 24 22:43:57 CET 2021 x86_64 Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz GenuineIntel GNU/Linux # ps faux | grep bcache | wc -l 284 # lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 931,5G 0 disk ├─sda1 8:1 0 2G 0 part /efi ├─sda2 8:2 0 865,5G 0 part │ ├─bcache0 254:0 0 2,7T 0 disk # pooled with bcache4 │ ├─bcache1 254:128 0 2,7T 0 disk # pooled with bcache4 │ ├─bcache2 254:256 0 925,5G 0 disk # pooled with bcache4 │ ├─bcache3 254:384 0 931,5G 0 disk /mnt/xfs-storage │ └─bcache4 254:512 0 925,5G 0 disk /mnt/btrfs-pool └─sda3 8:3 0 64G 0 part [SWAP] sdb 8:16 0 2,7T 0 disk ├─sdb1 8:17 0 32G 0 part [SWAP] └─sdb2 8:18 0 2,7T 0 part └─bcache0 254:0 0 2,7T 0 disk sdc 8:32 0 931,5G 0 disk ├─sdc1 8:33 0 6G 0 part [SWAP] └─sdc2 8:34 0 925,5G 0 part └─bcache4 254:512 0 925,5G 0 disk /mnt/btrfs-pool sdd 8:48 0 931,5G 0 disk ├─sdd1 8:49 0 6G 0 part [SWAP] └─sdd2 8:50 0 925,5G 0 part └─bcache2 254:256 0 925,5G 0 disk sde 8:64 0 3,7T 0 disk ├─sde1 8:65 0 6G 0 part [SWAP] ├─sde2 8:66 0 2,7T 0 part │ └─bcache1 254:128 0 2,7T 0 disk └─sde3 8:67 0 931,5G 0 part └─bcache3 254:384 0 931,5G 0 disk /mnt/xfs-storage sdf 8:80 1 7,5G 0 disk └─sdf1 8:81 1 7,5G 0 part # dmesg | grep bcache [ 2.079349] bcache: bch_journal_replay() journal replay done, 503 keys in 131 entries, seq 50942189 [ 2.079739] bcache: register_cache() registered cache device sda2 [ 2.079837] bcache: register_bdev() registered backing device sdb2 [ 2.286757] bcache: bch_cached_dev_attach() Caching sdb2 as bcache0 on set df11a09d-b91a-4cd3-80e2-2684f5b9c1b2 [ 2.286846] bcache: register_bdev() registered backing device sde2 [ 2.385586] BTRFS: device label system devid 5 transid 2310589 /dev/bcache0 scanned by systemd-udevd (341) [ 2.394597] bcache: bch_cached_dev_attach() Caching sde2 as bcache1 on set df11a09d-b91a-4cd3-80e2-2684f5b9c1b2 [ 2.394688] bcache: register_bdev() registered backing device sdd2 [ 2.507326] bcache: bch_cached_dev_attach() Caching sdd2 as bcache2 on set df11a09d-b91a-4cd3-80e2-2684f5b9c1b2 [ 2.507416] BTRFS: device label system devid 3 transid 2310589 /dev/bcache1 scanned by systemd-udevd (341) [ 2.507422] bcache: register_bdev() registered backing device sde3 [ 2.604546] bcache: bch_cached_dev_attach() Caching sde3 as bcache3 on set df11a09d-b91a-4cd3-80e2-2684f5b9c1b2 [ 2.604645] bcache: register_bdev() registered backing device sdc2 [ 2.708189] bcache: bch_cached_dev_attach() Caching sdc2 as bcache4 on set df11a09d-b91a-4cd3-80e2-2684f5b9c1b2 [ 2.731318] BTRFS: device label system devid 2 transid 2310589 /dev/bcache2 scanned by systemd-udevd (339) [ 2.748793] BTRFS: device label system devid 1 transid 2310589 /dev/bcache4 scanned by systemd-udevd (336) [ 3.201845] BTRFS info (device bcache4): use zstd compression, level 3 [ 3.201847] BTRFS info (device bcache4): enabling ssd optimizations [ 3.201848] BTRFS info (device bcache4): using free space tree [ 3.201848] BTRFS info (device bcache4): has skinny extents [ 3.212303] BTRFS info (device bcache4): bdev /dev/bcache4 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0 # ignore the corrupt 1, this was a one-time occurrence while scrubbing # I can no longer reproduce this, maybe cosmic rays flipped a bit during scrubbing ;-) [ 3.538177] BTRFS info (device bcache4): start tree-log replay [ 6.187652] BTRFS info (device bcache4): using free space tree [ 6.242252] XFS (bcache3): Mounting V5 Filesystem [ 6.386719] XFS (bcache3): Starting recovery (logdev: internal) [ 6.548018] XFS (bcache3): Ending recovery (logdev: internal) [ 6.989625] bcache: register_bcache() error : device already registered [ 6.991794] bcache: register_bcache() error : device already registered [ 7.007219] bcache: register_bcache() error : device already registered [ 7.300065] bcache: register_bcache() error : device already registered [ 7.301004] bcache: register_bcache() error : device already registered [ 7.370755] bcache: register_bcache() error : device already registered ``` Here's an excerpt of what I'm seeing: ``` # ps faux|grep bcache root 180 0.0 0.0 0 0 ? I< Jan24 0:00 \_ [bcache] root 381 0.0 0.0 0 0 ? I< Jan24 0:00 \_ [bcache_gc] root 399 0.0 0.0 0 0 ? S Jan24 0:01 \_ [bcache_allocato] root 400 0.0 0.0 0 0 ? S Jan24 0:10 \_ [bcache_gc] root 402 0.0 0.0 0 0 ? I< Jan24 0:00 \_ [bcache_writebac] root 403 0.2 0.0 0 0 ? S Jan24 1:23 \_ [bcache_writebac] root 408 0.0 0.0 0 0 ? S Jan24 0:00 \_ [bcache_status_u] root 410 0.0 0.0 0 0 ? I< Jan24 0:00 \_ [bcache_writebac] root 411 0.2 0.0 0 0 ? S Jan24 1:23 \_ [bcache_writebac] root 417 0.0 0.0 0 0 ? S Jan24 0:00 \_ [bcache_status_u] root 419 0.0 0.0 0 0 ? I< Jan24 0:00 \_ [bcache_writebac] root 420 0.1 0.0 0 0 ? S Jan24 0:51 \_ [bcache_writebac] root 425 0.0 0.0 0 0 ? S Jan24 0:00 \_ [bcache_status_u] root 427 0.0 0.0 0 0 ? I< Jan24 0:00 \_ [bcache_writebac] root 428 0.0 0.0 0 0 ? S Jan24 0:00 \_ [bcache_writebac] root 433 0.0 0.0 0 0 ? S Jan24 0:00 \_ [bcache_status_u] root 435 0.0 0.0 0 0 ? I< Jan24 0:00 \_ [bcache_writebac] root 436 0.1 0.0 0 0 ? S Jan24 0:51 \_ [bcache_writebac] root 441 0.0 0.0 0 0 ? S Jan24 0:00 \_ [bcache_status_u] root 570 0.0 0.0 0 0 ? I< Jan24 0:00 \_ [xfs-buf/bcache3] root 571 0.0 0.0 0 0 ? I< Jan24 0:00 \_ [xfs-conv/bcache] root 572 0.0 0.0 0 0 ? I< Jan24 0:00 \_ [xfs-cil/bcache3] root 575 0.0 0.0 0 0 ? I< Jan24 0:00 \_ [xfs-log/bcache3] root 576 0.0 0.0 0 0 ? S Jan24 0:03 \_ [xfsaild/bcache3] root 72935 0.0 0.0 0 0 ? I 01:08 0:00 \_ [kworker/4:33-bcache_writeback_wq] root 72946 0.0 0.0 0 0 ? I 01:08 0:00 \_ [kworker/4:88-bcache] root 73171 0.0 0.0 0 0 ? I 01:15 0:00 \_ [kworker/4:122-bcache] root 74312 0.0 0.0 0 0 ? I 01:57 0:00 \_ [kworker/4:205-bcache_writeback_wq] root 74649 0.0 0.0 0 0 ? I 02:09 0:00 \_ [kworker/4:177-bcache] root 74658 0.0 0.0 0 0 ? I 02:09 0:00 \_ [kworker/4:238-bcache] root 76622 0.0 0.0 0 0 ? I 03:08 0:00 \_ [kworker/4:222-bcache] root 76658 0.0 0.0 0 0 ? I 03:08 0:00 \_ [kworker/4:318-bcache] ... root 111243 0.0 0.0 0 0 ? I 03:49 0:00 \_ [kworker/4:165-bcache_writeback_wq] root 111628 0.0 0.0 0 0 ? I 04:05 0:00 \_ [kworker/4:11-bcache_writeback_wq] root 111643 0.0 0.0 0 0 ? I 04:05 0:00 \_ [kworker/4:124-bcache_writeback_wq] root 111679 0.0 0.0 0 0 ? I 04:05 0:00 \_ [kworker/4:245-bcache] root 111697 0.0 0.0 0 0 ? I 04:05 0:00 \_ [kworker/4:295-bcache_writeback_wq] root 111720 0.0 0.0 0 0 ? I 04:05 0:00 \_ [kworker/4:337-bcache] root 111860 0.0 0.0 0 0 ? I 04:09 0:00 \_ [kworker/4:386-bcache] root 111862 0.0 0.0 0 0 ? I 04:09 0:00 \_ [kworker/4:390-bcache_writeback_wq] root 111866 0.0 0.0 0 0 ? I 04:09 0:00 \_ [kworker/4:394-bcache_writeback_wq] root 111867 0.0 0.0 0 0 ? I 04:09 0:00 \_ [kworker/4:395-bcache] root 111880 0.0 0.0 0 0 ? I 04:09 0:00 \_ [kworker/4:408-bcache_writeback_wq] root 111882 0.0 0.0 0 0 ? I 04:09 0:00 \_ [kworker/4:410-bcache] root 111896 0.0 0.0 0 0 ? I 04:09 0:00 \_ [kworker/4:424-bcache] root 111897 0.0 0.0 0 0 ? I 04:09 0:00 \_ [kworker/4:425-bcache] ... root 114184 0.0 0.0 0 0 ? I 05:28 0:00 \_ [kworker/4:181-bcache_writeback_wq] root 114186 0.0 0.0 0 0 ? I 05:28 0:00 \_ [kworker/4:187-bcache_writeback_wq] root 114187 0.0 0.0 0 0 ? I 05:28 0:00 \_ [kworker/4:188-bcache_writeback_wq] root 114193 0.0 0.0 0 0 ? I 05:28 0:00 \_ [kworker/4:207-bcache_writeback_wq] ... root 117073 0.0 0.0 0 0 ? I 06:48 0:00 \_ [kworker/4:182-bcache_writeback_wq] root 117074 0.0 0.0 0 0 ? I 06:48 0:00 \_ [kworker/4:190-bcache_writeback_wq] root 117077 0.0 0.0 0 0 ? I 06:48 0:00 \_ [kworker/4:218-bcache] root 117081 0.0 0.0 0 0 ? I 06:48 0:00 \_ [kworker/4:239-bcache_writeback_wq] root 117084 0.0 0.0 0 0 ? I 06:48 0:00 \_ [kworker/4:249-bcache] root 117087 0.0 0.0 0 0 ? I 06:48 0:00 \_ [kworker/4:264-bcache_writeback_wq] root 117088 0.0 0.0 0 0 ? I 06:48 0:00 \_ [kworker/4:269-bcache] root 118151 0.0 0.0 0 0 ? I 06:53 0:00 \_ [kworker/4:299-bcache] root 118152 0.0 0.0 0 0 ? I 06:53 0:00 \_ [kworker/4:300-bcache] root 118154 0.0 0.0 0 0 ? I 06:53 0:00 \_ [kworker/4:305-bcache] root 118159 0.0 0.0 0 0 ? I 06:53 0:00 \_ [kworker/4:313-bcache] root 118170 0.0 0.0 0 0 ? I 06:53 0:00 \_ [kworker/4:326-bcache_writeback_wq] root 118172 0.0 0.0 0 0 ? I 06:53 0:00 \_ [kworker/4:328-bcache] root 118179 0.0 0.0 0 0 ? I 06:53 0:00 \_ [kworker/4:335-bcache] root 118184 0.0 0.0 0 0 ? I 06:53 0:00 \_ [kworker/4:341-bcache_writeback_wq] ... ``` During high write work-loads, I'm sometimes seeing over 2000 kthreads (most of them are bcache workers). TIA Kai