Re: rbd image journal performance

Jason Dillaman <jdillama@xxxxxxxxxx> · Sun, 18 Aug 2019 09:53:00 -0400

On Mon, Aug 12, 2019 at 10:03 PM yangjun@xxxxxxxxxxxxxxxxxxxx
<yangjun@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi Jason,
>
> I was recently testing the RBD mirror feature (Ceph 12.2.8). My test environment is a single-node cluster with 10 x 3T HDD OSDs + an 800G PCIe SSD + BlueStore, and the WAL and DB partition of the OSD is 30G.
> The test results for a 100G image are as follows:
>
>          disable journal   enable journal   decline percentage
> iops:    1000              877              12.3%
> bw:      402MB/s           129MB/s          67%
>
>
> Why does the bandwidth decline so much after enabling journaling on the RBD image? I would greatly appreciate any suggestions for optimization. Thank you very much.

Using the journal requires first writing to the journal and, once that
write is committed, writing to the image (i.e. doubling the latency).
Therefore, the expected worst-case performance should be around 2x
slower [1]. A recent bug fix [2] in the master branch, which will be
backported to older releases, greatly increases small IO journal
performance: because of the bug, small IO was nearly 10x slower
instead of the expected 2x [3].
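
As a rough sanity check, the figures quoted above can be converted into
slowdown factors and compared against the ~2x expectation. This is only
a sketch of the arithmetic: the helper names are made up for
illustration, and the inputs are just the numbers from the quoted test,
not new measurements.

```python
# Minimal sketch: turn the quoted benchmark figures into slowdown factors.
# Function names are hypothetical; the inputs come from the quoted table.

def slowdown(baseline: float, journaled: float) -> float:
    """How many times slower the journaled run is than the baseline."""
    return baseline / journaled


def decline_pct(baseline: float, journaled: float) -> float:
    """Percentage drop from the baseline to the journaled result."""
    return (baseline - journaled) / baseline * 100


# (journal disabled, journal enabled), as reported in the quoted test
results = {
    "iops": (1000, 877),
    "bw (MB/s)": (402, 129),
}

for name, (base, jrnl) in results.items():
    print(f"{name}: {slowdown(base, jrnl):.2f}x slower, "
          f"{decline_pct(base, jrnl):.1f}% decline")
# iops: 1.14x slower, 12.3% decline
# bw (MB/s): 3.12x slower, 67.9% decline
```

By this arithmetic the IOPS result is well within the 2x expectation,
while the bandwidth result is slower than 2x.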

> ________________________________
> yangjun@xxxxxxxxxxxxxxxxxxxx

[1] https://www.slideshare.net/JasonDillaman/disaster-recovery-and-ceph-block-storage-introducing-multisite-mirroring/17
[2] https://tracker.ceph.com/issues/40072
[3] https://youtu.be/ZifNGprBUTA?t=1687

-- 
Jason