Hmm, so maybe your hardware is good enough that the cache is actually not helping? This is not unheard of. I don't really see any improvement from caching to begin with.

On the other hand, a synthetic benchmark does not really exercise the strengths of the cache (in particular, write merges will probably not occur). It would probably make more sense to run real VMs with a real workload for a while and monitor latencies etc. over a longer period of time.

Other than that, I only see that the cache size is quite small. You do 100G of random operations against a 16M cache; the default is 32M. I would not expect anything interesting from this ratio. A cache only makes sense if you get a lot of cache hits. In addition, the max_dirty and target_dirty values are really high as a percentage of the cache size. This can keep deferred operations pending for too long and result in a cache flush blocking IO.

A larger cache size, smaller dirty targets and a benchmark that simulates a realistic workload might be worth investigating.
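Something along these lines might be a more reasonable starting point (illustrative values only, not tested on your setup; scale the cache to the memory you can afford per client):

[client]
rbd cache = true
rbd cache size = 268435456
rbd cache max dirty = 67108864
rbd cache target dirty = 33554432
rbd cache max dirty age = 1
rbd cache writethrough until flush = true

That is a 256M cache with max_dirty at 25% and target_dirty at about 12%, so flushing starts early and in the background instead of blocking new IO when the cache fills up.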
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: norman <norman.kern@xxxxxxx>
Sent: 20 November 2020 13:58:27
To: Frank Schilder
Cc: ceph-users
Subject: Re: The serious side-effect of rbd cache setting

With rbd cache = false and the same two tests, the read IOPS are stable (this is a new cluster with no other load):

    SEC       OPS   OPS/SEC    BYTES/SEC
    109    274471   2319.41   9500308.72
    110    276846   2380.81   9751782.65
    111    278969   2431.40   9959023.39
    112    280924   2287.21   9368428.23
    113    282886   2227.82   9125145.62
    114    286130   2331.61   9550275.83
    115    289693   2569.19  10523406.25
    116    293161   2838.17  11625140.61
    117    296484   3111.75  12745715.04
    118    300068   3436.12  14074349.33
    119    302424   3258.53  13346958.90
    120    304442   2949.56  12081397.86
    121    306988   2765.18  11326156.91
    122    309867   2676.38  10962461.69
    123    312475   2481.20  10162987.53
    124    314957   2506.40  10266198.33
    125    317124   2536.19  10388249.19
    126    320239   2649.98  10854336.06
    127    323243   2674.98  10956727.73
    128    326688   2842.37  11642342.34
    129    328855   2779.37  11384315.33
    130    331414   2857.77  11705415.59
    131    333811   2714.18  11117277.84
    132    336164   2583.99  10584022.02
    133    338664   2395.01   9809941.00
    134    341417   2512.20  10289953.14
    135    344409   2598.79  10644637.88
    136    347112   2659.98  10895292.68
    137    349486   2664.18  10912494.47
    138    351921   2651.18  10859250.80
    139    354592   2634.79  10792081.86
    140    357559   2629.79  10771603.52

On 20/11/2020 8:50 PM, Frank Schilder wrote:
> Do you have test results for the same test without caching?
>
> I have seen periodic stalls in any RBD IOP/s benchmark on Ceph. The benchmarks create IO requests much faster than the OSDs can handle them. At some point all queues run full and you start seeing slow ops on OSDs.
>
> I would also prefer IO activity to be more steady and not so bursty, but for some reason IO throttling is pushed to the clients instead of the internal OPS queueing system (Ceph is collaborative, meaning a rogue, un-collaborative client can screw it up for everyone).
>
> If you know what your IO stack can handle without stalls, you can use libvirt QOS settings to limit clients with reasonable peak-load and steady-load settings.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: norman <norman.kern@xxxxxxx>
> Sent: 20 November 2020 13:40:18
> To: ceph-users
> Subject: The serious side-effect of rbd cache setting
>
> Hi All,
>
> We're testing the rbd cache settings for OpenStack (Ceph 14.2.5, BlueStore, 3-replica), and we found an odd problem:
>
> 1. Setting the librbd cache:
>
> [client]
> rbd cache = true
> rbd cache size = 16777216
> rbd cache max dirty = 12582912
> rbd cache target dirty = 8388608
> rbd cache max dirty age = 1
> rbd cache writethrough until flush = true
>
> 2. Running rbd bench:
>
> rbd -c /etc/ceph/ceph.conf \
>     -k /etc/ceph/keyring2 \
>     -n client.rbd-openstack-002 bench \
>     --io-size 4K \
>     --io-threads 1 \
>     --io-pattern seq \
>     --io-type read \
>     --io-total 100G \
>     openstack-volumes/image-you-can-drop-me
>
> 3. Starting another test:
>
> rbd -c /etc/ceph/ceph.conf \
>     -k /etc/ceph/keyring2 \
>     -n client.rbd-openstack-002 bench \
>     --io-size 4K \
>     --io-threads 1 \
>     --io-pattern rand \
>     --io-type write \
>     --io-total 100G \
>     openstack-volumes/image-you-can-drop-me2
>
> After running for a few minutes, I found the read test almost hung for a while:
>
>     SEC       OPS   OPS/SEC    BYTES/SEC
>      69    152069   2375.21   9728858.72
>      70    153627   2104.63   8620569.93
>      71    155748   1956.04   8011953.10
>      72    157665   1945.84   7970177.24
>      73    159661   1947.64   7977549.44
>      74    161522   1890.45   7743277.01
>      75    163583   1991.04   8155301.58
>      76    165791   2008.44   8226566.26
>      77    168433   2153.43   8820438.66
>      78    170269   2121.43   8689377.16
>      79    172511   2197.62   9001467.33
>      80    174845   2252.22   9225091.00
>      81    177089   2259.42   9254579.83
>      82    179675   2248.22   9208708.30
>      83    182053   2356.61   9652679.11
>      84    185087   2515.00  10301433.50
>      99    185345    550.16   2253434.96
>     101    185346    407.76   1670187.73
>     102    185348    282.44   1156878.38
>     103    185350    162.34    664931.53
>     104    185353     12.86     52681.27
>     105    185357      1.93      7916.89
>     106    185361      2.74     11235.38
>     107    185367      3.27     13379.95
>     108    185375      5.08     20794.43
>     109    185384      6.93     28365.91
>     110    185403      9.19     37650.06
>     111    185438     17.47     71544.17
>     128    185467      4.94     20243.53
>     129    185468      4.45     18210.82
>     131    185469      3.89     15928.44
>     132    185493      4.09     16764.16
>     133    185529      4.16     17037.21
>     134    185578     18.64     76329.67
>     135    185631     27.78    113768.65
>
> Why did this happen? This is unacceptable read performance.

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx