On Thu, Feb 9, 2023 at 1:22 PM <benard_bsc@xxxxxxxxxxx> wrote:
> I believe I have found a bug in bcache where the btree grows out of
> control and makes operations like garbage collection take a very long
> time, which affects client IO. Periodically, the bcache devices stop
> responding to client IO and the cache device starts doing a large
> number of reads. To test this, I triggered gc manually using
> 'echo 1 > trigger_gc' and observed the cache set. Once again, a large
> number of reads started on the cache device and all the bcache devices
> of that cache set stopped responding. I believe this is because gc
> blocks all client IO while it is running (I might be wrong). Checking
> the stats, I can see that 'btree_gc_average_duration_ms' is almost
> 2 minutes, which seems excessively large to me. Furthermore, doing
> something like checking bset_tree_stats just hangs and causes a
> similar performance impact.
> An interesting thing to note is that after garbage collection the
> number of btree nodes is lower but the btree cache actually grows in
> size.
> Example:
> /sys/fs/bcache/c_set# cat btree_cache_size
> 5.2G
> /sys/fs/bcache/c_set# cat internal/btree_nodes
> 28318
> /sys/fs/bcache/c_set# cat average_key_size
> 25.2k
> For reference, I have a similar environment (which is busier and
> has more data stored) that doesn't experience this issue, and the
> numbers for the above are:
> /sys/fs/bcache/c_set# cat btree_cache_size
> 840.5M
> /sys/fs/bcache/c_set# cat internal/btree_nodes
> 3827
> /sys/fs/bcache/c_set# cat average_key_size
> 88.3k
> Kernel version: 5.4.0-122-generic
> OS version: Ubuntu 18.04.6 LTS
Hi Bernard,
Your Linux distro and kernel version are quite old. There is a good
chance that this has been fixed in the meantime. Would it be possible
for you to try to reproduce the bug with a newer kernel?
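
To compare the old and new kernels, a small helper along these lines
might capture the same counters quoted in the report. This is only a
sketch: collect_bcache_stats is a made-up name, the cache set directory
is passed in by the caller, and the internal/ location of
btree_gc_average_duration_ms is an assumption (missing files are
reported, not treated as errors). It deliberately does not read
bset_tree_stats, since the report says that read hangs.

```shell
#!/bin/sh
# Sketch: print a few bcache cache-set counters from sysfs.
# $1 is the cache set directory (for example the c_set directory
# under /sys/fs/bcache used in the report).
collect_bcache_stats() {
    cset_dir=$1
    # Attribute names come from the report; the internal/ prefix on the
    # gc duration counter is an assumption and may differ per kernel.
    for attr in \
        btree_cache_size \
        average_key_size \
        internal/btree_nodes \
        internal/btree_gc_average_duration_ms
    do
        if [ -r "$cset_dir/$attr" ]; then
            printf '%s: %s\n' "$attr" "$(cat "$cset_dir/$attr")"
        else
            printf '%s: (not found)\n' "$attr"
        fi
    done
}
```

Running collect_bcache_stats with the cache set directory before and
after a manual gc, on both kernels, would give directly comparable
numbers.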

> bcache-tools package: 1.0.8-2ubuntu0.18.04.1
> I am able to provide more info if needed.
> Regards

