Re: RBD with PWL cache shows poor performance compared to cache device

On Fri, 30 Jun 2023 at 08:50, Yin, Congmin <congmin.yin@xxxxxxxxx> wrote:
>
> Hi Matthew,
>
> Due to the latency of the rbd layers, the write latency of the PWL cache is more than ten times that of the raw device.
> I replied directly below the two questions.
>
> Best regards.
> Congmin Yin
>
>
> -----Original Message-----
> From: Matthew Booth <mbooth@xxxxxxxxxx>
> Sent: Thursday, June 29, 2023 7:23 PM
> To: Ilya Dryomov <idryomov@xxxxxxxxxx>
> Cc: Giulio Fidente <gfidente@xxxxxxxxxx>; Yin, Congmin <congmin.yin@xxxxxxxxx>; Tang, Guifeng <guifeng.tang@xxxxxxxxx>; Vikhyat Umrao <vumrao@xxxxxxxxxx>; Jdurgin <Jdurgin@xxxxxxxxxx>; John Fulton <johfulto@xxxxxxxxxx>; Francesco Pantano <fpantano@xxxxxxxxxx>; ceph-users@xxxxxxx
> Subject: Re:  RBD with PWL cache shows poor performance compared to cache device
>
> On Wed, 28 Jun 2023 at 22:44, Ilya Dryomov <idryomov@xxxxxxxxxx> wrote:
> >> ** TL;DR
> >>
> >> In testing, the write latency performance of a PWL-cache backed RBD
> >> disk was 2 orders of magnitude worse than the disk holding the PWL
> >> cache.
>
>
>
> The PWL cache can use pmem or SSD as the cache device. For pmem, based on my test environment at the time, I can give specific numbers: the write latency of the raw pmem device is about 10+ us, the write latency of the PWL cache is about 100+ us (the extra latency comes from the rbd layers), and the write latency of the Ceph cluster is about 1000+ us (from the messengers and the network). For SSDs there are many types, so I cannot give a specific value, but it will certainly be worse than pmem. A result that is two orders of magnitude slower is therefore worse than expected. Can you provide the detailed values of all three (SSD, PWL cache, Ceph cluster) for analysis?

I'm not entirely sure what you're asking for. Which values are you looking for?

I did provide three sets of test results below; is that what you mean?
* rbd no cache: 1417216 ns
* pwl cache device: 44288 ns
* rbd with pwl cache: 5210112 ns

These are all outputs from the benchmarking test. The first was run in
the VM, writing to a Ceph RBD disk *without* the PWL cache. The second
was run on the host, writing directly to the SSD that is used for the
PWL cache. The third was run in the VM, writing to the same Ceph RBD
disk, but this time *with* the PWL cache.

Incidentally, the client and server machines are identical, and the
SSD used by the client for the PWL cache is the same model as the SSDs
used for the OSDs on the server. The SSDs are SAMSUNG MZ7KH480HAHQ0D3
drives attached to a PERC H730P Mini (Embedded) controller.
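
For anyone wanting to reproduce this kind of number: an etcd-like
workload (4k writes, queue depth 1, fdatasync after every write) can be
approximated with fio along the lines below. This is only a sketch, not
necessarily the exact invocation used for the figures above, and the
directory path, size and runtime are placeholders. The sync/fdatasync
latency section of the fio output is the number of interest.

    # On the host, against a filesystem on the PWL SSD:
    fio --name=etcd-like --directory=/mnt/pwl-ssd-test --size=256m \
        --rw=write --bs=4k --iodepth=1 --ioengine=sync --fdatasync=1 \
        --runtime=60 --time_based

    # The same invocation inside the VM, pointed at a directory on the
    # RBD-backed disk, covers the "rbd no cache" / "rbd with pwl cache"
    # cases depending on how the image is configured.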

> ==============================================================
>
> >>
> >> ** Summary
> >>
> >> I was hoping that PWL cache might be a good solution to the problem
> >> of write latency requirements of etcd when running a kubernetes
> >> control plane on ceph. Etcd is extremely write latency sensitive and
> >> becomes unstable if write latency is too high. The etcd workload can
> >> be characterised by very small (~4k) writes with a queue depth of 1.
> >> Throughput, even on a busy system, is normally very low. As etcd is
> >> distributed and can safely handle the loss of un-flushed data from a
> >> single node, a local ssd PWL cache for etcd looked like an ideal
> >> solution.
> >
> >
> > Right, this is exactly the use case that the PWL cache is supposed to address.
>
> Good to know!
>
> >> My expectation was that adding a PWL cache on a local SSD to an
> >> RBD-backed disk would improve write latency to something approaching the
> >> write latency performance of the local SSD. However, in my testing
> >> adding a PWL cache to an rbd-backed VM increased write latency by
> >> approximately 4x over not using a PWL cache. This was over 100x more
> >> than the write latency performance of the underlying SSD.
>
>
>
>
> When using an image as the VM's disk, you may have used a command like the one below. In many cases, parameters such as cache=writeback will enable the rbd cache, which is an in-memory cache, and it is normal for the PWL cache to be several times slower than that. Please confirm.
> There is currently no parameter that enables only the PWL cache without the rbd cache. I have tested the latency of the PWL cache (pmem) by modifying the code myself, and it is about twice that of the rbd cache.
>
> qemu -m 1024 -drive format=raw,file=rbd:data/squeeze:rbd_cache=true,cache=writeback

I created the RBD disk by first installing the VM on a local qcow2
file, then copying the data from the qcow2 to RBD, converting it to
raw. The command I used was:

`qemu-img convert -f qcow2 -O raw /var/lib/libvirt/images/pwl-test.qcow2 rbd:libvirt-pool/pwl-test:id=libvirt`
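
If it helps with reproducing the setup, the converted image can be
sanity-checked from the client with standard commands, for example (the
id matches the convert command above):

    rbd --id libvirt info libvirt-pool/pwl-test
    qemu-img info rbd:libvirt-pool/pwl-test:id=libvirt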

I am configuring rbd options from the server side by setting them on
the pool, and I have been confirming that they are applied correctly
with `rbd status libvirt-pool/pwl-test` on the server.
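
To be concrete about what "setting options on the pool" means, the
commands are of this form (a sketch using the documented PWL option
names; the cache path and size here are placeholders, not necessarily
the values in use):

    # PWL cache options set at pool scope:
    rbd config pool set libvirt-pool rbd_plugins pwl_cache
    rbd config pool set libvirt-pool rbd_persistent_cache_mode ssd
    rbd config pool set libvirt-pool rbd_persistent_cache_path /mnt/pwl-cache
    rbd config pool set libvirt-pool rbd_persistent_cache_size 10G

    # For the rbd_cache=false runs mentioned below:
    rbd config pool set libvirt-pool rbd_cache false

    # Verify what an image actually resolves to, and its cache state
    # while a client has it open:
    rbd config image list libvirt-pool/pwl-test | grep -E 'cache|plugins'
    rbd status libvirt-pool/pwl-test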

The latest set of profiling data requested by Mark was generated
entirely with `rbd_cache=false`:
https://gist.github.com/mdbooth/2d68b7e081a37e27b78fe396d771427d
-- 
Matthew Booth
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


