Hi Coly

Thank you for confirming. It looks like the 6.9 merge window just opened
last week, so we hope this patch can catch it. Please post an update in
this thread when it gets submitted.

https://lore.kernel.org/lkml/CAHk-=wiehc0DfPtL6fC2=bFuyzkTnuiuYSQrr6JTQxQao6pq1Q@xxxxxxxxxxxxxx/T/

BTW, speaking of testing, would you mind pointing us to the bcache test
suite? We would like to have a look and maybe give it a try as well.

Thanks
Robert

On Sun, Mar 17, 2024 at 7:00 AM Coly Li <colyli@xxxxxxx> wrote:
>
>
>
> > On Mar 17, 2024, at 13:41, Robert Pang <robertpang@xxxxxxxxxx> wrote:
> >
> > Hi Coly
> >
>
> Hi Robert,
>
> > Thank you for looking into this issue.
> >
> > We tested this patch on 5 machines with local SSD sizes ranging from
> > 375 GB to 9 TB, and ran tests for 10 to 12 hours each. We observed no
> > stalls or other issues. Performance was comparable before and after
> > the patch. Hope this information is helpful.
>
> Thanks for the information.
>
> Also, I was told this patch has been deployed and shipped in easystack products for 1+ year and works well.
>
> The above information makes me feel confident about this patch. I will submit it in the next merge window once my extended testing loop passes.
>
> Coly Li
>
>
>
> > On Fri, Mar 15, 2024 at 7:49 PM Coly Li <colyli@xxxxxxx> wrote:
> >>
> >> Hi Robert,
> >>
> >> Thanks for your email.
> >>
> >>> On Mar 16, 2024, at 06:45, Robert Pang <robertpang@xxxxxxxxxx> wrote:
> >>>
> >>> Hi all
> >>>
> >>> We found this patch via Google.
> >>>
> >>> We have a setup that uses bcache to cache network-attached storage on a local SSD drive. Under heavy traffic, IO on the cached device stalls every hour or so for tens of seconds. When we track the latency continuously with the "fio" utility, we can see the max IO latency shoot up when the stall happens:
> >>>
> >>> latency_test: (groupid=0, jobs=1): err= 0: pid=50416: Fri Mar 15 21:14:18 2024
> >>>   read: IOPS=62.3k, BW=486MiB/s (510MB/s)(11.4GiB/24000msec)
> >>>     slat (nsec): min=1377, max=98964, avg=4567.31, stdev=1330.69
> >>>     clat (nsec): min=367, max=43682, avg=429.77, stdev=234.70
> >>>      lat (nsec): min=1866, max=105301, avg=5068.60, stdev=1383.14
> >>>     clat percentiles (nsec):
> >>>      |  1.00th=[  386],  5.00th=[  406], 10.00th=[  406], 20.00th=[  410],
> >>>      | 30.00th=[  414], 40.00th=[  414], 50.00th=[  414], 60.00th=[  418],
> >>>      | 70.00th=[  418], 80.00th=[  422], 90.00th=[  426], 95.00th=[  462],
> >>>      | 99.00th=[  652], 99.50th=[  708], 99.90th=[ 3088], 99.95th=[ 5600],
> >>>      | 99.99th=[11328]
> >>>    bw (  KiB/s): min=318192, max=627591, per=99.97%, avg=497939.04, stdev=81923.63, samples=47
> >>>    iops        : min=39774, max=78448, avg=62242.15, stdev=10240.39, samples=47
> >>> ...
> >>>
> >>> <IO stall>
> >>>
> >>> latency_test: (groupid=0, jobs=1): err= 0: pid=50416: Fri Mar 15 21:21:23 2024
> >>>   read: IOPS=26.0k, BW=203MiB/s (213MB/s)(89.1GiB/448867msec)
> >>>     slat (nsec): min=958, max=40745M, avg=15596.66, stdev=13650543.09
> >>>     clat (nsec): min=364, max=104599, avg=435.81, stdev=302.81
> >>>      lat (nsec): min=1416, max=40745M, avg=16104.06, stdev=13650546.77
> >>>     clat percentiles (nsec):
> >>>      |  1.00th=[  378],  5.00th=[  390], 10.00th=[  406], 20.00th=[  410],
> >>>      | 30.00th=[  414], 40.00th=[  414], 50.00th=[  418], 60.00th=[  418],
> >>>      | 70.00th=[  418], 80.00th=[  422], 90.00th=[  426], 95.00th=[  494],
> >>>      | 99.00th=[  772], 99.50th=[  916], 99.90th=[ 3856], 99.95th=[ 5920],
> >>>      | 99.99th=[10816]
> >>>    bw (  KiB/s): min=     1, max=627591, per=100.00%, avg=244393.77, stdev=103534.74, samples=765
> >>>    iops        : min=     0, max=78448, avg=30549.06, stdev=12941.82, samples=765
> >>>
> >>> When we track per-second max latency in fio, we see something like this:
> >>>
> >>> <time-ms>,<max-latency-ns>,,,
> >>> ...
> >>> 777000, 5155548, 0, 0, 0
> >>> 778000, 105551, 1, 0, 0
> >>> 802615, 24276019570, 0, 0, 0
> >>> 802615, 82134, 1, 0, 0
> >>> 804000, 9944554, 0, 0, 0
> >>> 805000, 7424638, 1, 0, 0
> >>>
> >>> The fio command we used is:
> >>>
> >>> fio --time_based --runtime=3600s --ramp_time=2s --ioengine=libaio --name=latency_test --filename=fio --bs=8k --iodepth=1 --size=900G --readwrite=randrw --verify=0 --write_lat_log=lat --log_avg_msec=1000 --log_max_value=1
> >>>
> >>> We saw a similar issue reported in https://www.spinics.net/lists/linux-bcache/msg09578.html, which suggests an issue in garbage collection. When we trigger GC manually via "echo 1 > /sys/fs/bcache/a356bdb0-...-64f794387488/internal/trigger_gc", the stall is always reproduced. That thread points to this patch (https://www.spinics.net/lists/linux-bcache/msg08870.html), which we tested, and the stall no longer happens.
> >>>
> >>> AFAIK, this patch marks buckets as reclaimable at the beginning of GC to unblock the allocator, so it does not need to wait for GC to finish. This periodic stall is a serious issue. Could the community please look into this issue and this patch?
> >>>
> >>
> >> Could you please share more performance information about this patch? And how many nodes/how long a period does the testing cover so far?
> >>
> >> The last time I tested the patch, it looked fine. But I was not confident about how large a scale and for how long this patch had been tested. If you can provide more testing information, it will be helpful.
> >>
> >>
> >> Coly Li
> >
>
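
P.S. In case it helps others reproduce the stall, below is a rough sketch that combines the fio latency job and the manual GC trigger quoted above. It is illustrative only, not the exact script we ran: the cache-set UUID is a placeholder you must supply, the 10-minute runtime and 2-minute GC interval are arbitrary, and the fio latency log name (lat_lat.1.log here) may differ between fio versions.

#!/bin/bash
# Rough repro sketch: run the fio latency job while periodically forcing
# garbage collection on the bcache cache set. Run as root.
set -eu

# Placeholder: pass your cache-set UUID (e.g. from /sys/fs/bcache/) as $1.
CSET_UUID="${1:?usage: $0 <bcache cache-set uuid>}"

# Same latency job as quoted above, with a shorter runtime for the sketch.
fio --time_based --runtime=600s --ramp_time=2s --ioengine=libaio \
    --name=latency_test --filename=fio --bs=8k --iodepth=1 --size=900G \
    --readwrite=randrw --verify=0 \
    --write_lat_log=lat --log_avg_msec=1000 --log_max_value=1 &
FIO_PID=$!

# Force a GC run every 2 minutes (arbitrary interval) while fio is running.
while kill -0 "$FIO_PID" 2>/dev/null; do
    sleep 120
    echo 1 > "/sys/fs/bcache/${CSET_UUID}/internal/trigger_gc" || true
done
wait "$FIO_PID"

# Show the worst per-second max total latencies (column 2, in nanoseconds).
# The log file name may vary with the fio version.
sort -t, -k2,2 -n lat_lat.1.log | tail -5

A stall shows up as an entry in the last column printed here with a max latency in the tens of seconds, like the 24276019570 ns sample in the log excerpt above.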