Re: Large latency with bcache for Ceph OSD


 



On 2021/2/24 4:52 PM, Coly Li wrote:
> On 2/22/21 7:48 AM, Norman.Kern wrote:
>> Ping.
>>
>> I'm confused about SYNC I/O on bcache. Why must SYNC I/O be written back
>> to the backing device for persistence? It can cause some latency.
>>
>> @Coly, can you help me understand why bcache handles O_SYNC like this?
>>
>>
> Hmm, normally we won't observe the application's I/Os reaching the backing
> device except for:
> - I/O bypassed because of SSD congestion
> - Sequential I/O requests
> - Dirty buckets exceeding the cutoff threshold
> - Writethrough mode
>
> Do you set the write/read congestion thresholds to 0?

Thanks for your reply.

I have set the thresholds to zero. Here is my full configuration:

#make-bcache -C -b 4m -w 4k --discard --cache_replacement_policy=lru /dev/sdm
#make-bcache -B --writeback -w 4KiB /dev/sdn --wipe-bcache
congested_read_threshold_us = 0
congested_write_threshold_us = 0

# I tried setting sequential_cutoff to 0, but it didn't solve the problem.

sequential_cutoff = 4194304
writeback_percent = 40
cache_mode = writeback
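
For reference, these values are applied via sysfs roughly like this (the cache set UUID is the one from this host; the bcache0 device name is an assumption and may differ):

# cache-set-wide settings; 0 disables the congestion-based bypass
echo 0 > /sys/fs/bcache/d87713c6-2e76-4a09-8517-d48306468659/congested_read_threshold_us
echo 0 > /sys/fs/bcache/d87713c6-2e76-4a09-8517-d48306468659/congested_write_threshold_us

# per-backing-device settings (bcache0 assumed)
echo writeback > /sys/block/bcache0/bcache/cache_mode
echo 40 > /sys/block/bcache0/bcache/writeback_percent
echo 4194304 > /sys/block/bcache0/bcache/sequential_cutoff   # 4 MiB; 0 disables the sequential bypass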

I rebuilt the cluster, ran it for several hours, and reproduced the problem. Then I checked the cache status:

root@WXS0106:/root/perf-tools# cat /sys/fs/bcache/d87713c6-2e76-4a09-8517-d48306468659/cache_available_percent
29
root@WXS0106:/root/perf-tools# cat /sys/fs/bcache/d87713c6-2e76-4a09-8517-d48306468659/internal/cutoff_writeback_sync
70
So 'dirty buckets exceed the cutoff threshold' looks like the cause: cache_available_percent is 29, which suggests roughly 71% of the cache already holds dirty data, above the cutoff_writeback_sync value of 70. Is my configuration wrong, or is there another reason?
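
To double-check that writeback is falling behind, I also look at the backing device's dirty data and writeback state (again assuming the device shows up as bcache0):

cat /sys/block/bcache0/bcache/state                 # expect "dirty" here
cat /sys/block/bcache0/bcache/dirty_data            # amount of dirty data in the cache for this device
cat /sys/block/bcache0/bcache/writeback_percent     # dirty-data target for background writeback
cat /sys/block/bcache0/bcache/writeback_rate_debug  # current rate, target and controller terms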

>
> Coly Li
>
>> On 2021/2/18 3:56 PM, Norman.Kern wrote:
>>> Hi guys,
>>>
>>> I am testing Ceph with bcache. I found some I/O with O_SYNC being written
>>> back to the HDD, which caused large latency on the HDD. I traced the I/O with iosnoop:
>>>
>>> ./iosnoop -Q -ts -d '8,192'
>>>
>>> Tracing block I/O for 1 seconds (buffered)...
>>> STARTs          ENDs            COMM         PID    TYPE DEV    BLOCK        BYTES     LATms
>>>
>>> 1809296.292350  1809296.319052  tp_osd_tp    22191  R    8,192  4578940240   16384     26.70
>>> 1809296.292330  1809296.320974  tp_osd_tp    22191  R    8,192  4577938704   16384     28.64
>>> 1809296.292614  1809296.323292  tp_osd_tp    22191  R    8,192  4600404304   16384     30.68
>>> 1809296.292353  1809296.325300  tp_osd_tp    22191  R    8,192  4578343088   16384     32.95
>>> 1809296.292340  1809296.328013  tp_osd_tp    22191  R    8,192  4578055472   16384     35.67
>>> 1809296.292606  1809296.330518  tp_osd_tp    22191  R    8,192  4578581648   16384     37.91
>>> 1809295.169266  1809296.334041  bstore_kv_fi 17266  WS   8,192  4244996360   4096    1164.78
>>> 1809296.292618  1809296.336349  tp_osd_tp    22191  R    8,192  4602631760   16384     43.73
>>> 1809296.292618  1809296.338812  tp_osd_tp    22191  R    8,192  4602632976   16384     46.19
>>> 1809296.030103  1809296.342780  tp_osd_tp    22180  WS   8,192  4741276048   131072   312.68
>>> 1809296.292347  1809296.345045  tp_osd_tp    22191  R    8,192  4609037872   16384     52.70
>>> 1809296.292620  1809296.345109  tp_osd_tp    22191  R    8,192  4609037904   16384     52.49
>>> 1809296.292612  1809296.347251  tp_osd_tp    22191  R    8,192  4578937616   16384     54.64
>>> 1809296.292621  1809296.351136  tp_osd_tp    22191  R    8,192  4612654992   16384     58.51
>>> 1809296.292341  1809296.353428  tp_osd_tp    22191  R    8,192  4578220656   16384     61.09
>>> 1809296.292342  1809296.353864  tp_osd_tp    22191  R    8,192  4578220880   16384     61.52
>>> 1809295.167650  1809296.358510  bstore_kv_fi 17266  WS   8,192  4923695960   4096    1190.86
>>> 1809296.292347  1809296.361885  tp_osd_tp    22191  R    8,192  4607437136   16384     69.54
>>> 1809296.029363  1809296.367313  tp_osd_tp    22180  WS   8,192  4739824400   98304    337.95
>>> 1809296.292349  1809296.370245  tp_osd_tp    22191  R    8,192  4591379888   16384     77.90
>>> 1809296.292348  1809296.376273  tp_osd_tp    22191  R    8,192  4591289552   16384     83.92
>>> 1809296.292353  1809296.378659  tp_osd_tp    22191  R    8,192  4578248656   16384     86.31
>>> 1809296.292619  1809296.384835  tp_osd_tp    22191  R    8,192  4617494160   65536     92.22
>>> 1809295.165451  1809296.393715  bstore_kv_fi 17266  WS   8,192  1355703120   4096    1228.26
>>> 1809295.168595  1809296.401560  bstore_kv_fi 17266  WS   8,192  1122200      4096    1232.96
>>> 1809295.165221  1809296.408018  bstore_kv_fi 17266  WS   8,192  960656       4096    1242.80
>>> 1809295.166737  1809296.411505  bstore_kv_fi 17266  WS   8,192  57682504     4096    1244.77
>>> 1809296.292352  1809296.418123  tp_osd_tp    22191  R    8,192  4579459056   32768    125.77
>>>
>>> I'm confused why writes with O_SYNC must be written back to the backend
>>> storage device. And after using bcache for a while,
>>>
>>> the latency increased a lot (the SSD is not very busy). Are there any
>>> best practices for configuration?
>>>
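
For what it's worth, a rough way to see which bypass condition is actually being hit is to watch the per-device bypass statistics (paths below again assume the backing device shows up as bcache0):

cat /sys/block/bcache0/bcache/stats_total/bypassed            # total amount of I/O that skipped the cache
cat /sys/block/bcache0/bcache/stats_five_minute/bypassed      # same counter over the last five minutes
cat /sys/block/bcache0/bcache/stats_total/cache_bypass_hits   # bypassed I/Os that still found cached data
cat /sys/block/bcache0/bcache/stats_total/cache_bypass_misses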



