Re: RBD with PWL cache shows poor performance compared to cache device

On Thu, 6 Jul 2023 at 12:54, Mark Nelson <mark.nelson@xxxxxxxxx> wrote:
>
>
> On 7/6/23 06:02, Matthew Booth wrote:
> > On Wed, 5 Jul 2023 at 15:18, Mark Nelson <mark.nelson@xxxxxxxxx> wrote:
> >> I'm sort of amazed that it gave you symbols without the debuginfo
> >> packages installed.  I'll need to figure out a way to prevent that.
> >> Having said that, your new traces look more accurate to me.  The thing
> >> that sticks out to me is the (slight?) amount of contention on the PWL
> >> m_lock in dispatch_deferred_writes, update_root_scheduled_ops,
> >> append_ops, append_sync_point(), etc.
> >>
> >> I don't know if the contention around the m_lock is enough to cause an
> >> increase in 99% tail latency from 1.4ms to 5.2ms, but it's the first
> >> thing that jumps out at me.  There appear to be a large number of
> >> threads (each tp_pwl thread, the io_context_pool threads, the qemu
> >> thread, and the bstore_aio thread) with the potential to contend on
> >> that lock.  You could try dropping the number of tp_pwl threads
> >> from 4 to 1 and see if that changes anything.
> > Will do. Any idea how to do that? I don't see an obvious rbd config option.
> >
> > Thanks for looking into this,
> > Matt
>
> You thanked me too soon... it appears to be hard-coded, so you'll
> have to do a custom build. :D
>
> https://github.com/ceph/ceph/blob/main/src/librbd/cache/pwl/AbstractWriteLog.cc#L55-L56

Just to update: I have managed to test this today and it made no difference :(
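
For reference, the thread count really is a compile-time literal, so
testing this meant a rebuild. The relevant bit is the initializer Mark
linked above, which (paraphrasing from the link; the exact lines may
have moved since) looks roughly like:

    // src/librbd/cache/pwl/AbstractWriteLog.cc -- the tp_pwl thread
    // pool; the literal 4 is the thread count, dropped to 1 for this
    // test
    m_thread_pool(
        image_ctx.cct, "librbd::cache::pwl::AbstractWriteLog::thread_pool",
        "tp_pwl", 4, ""),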

In general, though, unless it's something egregious, are we really
looking for something CPU-bound? Writes are two orders of magnitude
slower than the underlying local disk. This has to be caused by
something wildly inefficient.

I have had a thought: the guest filesystem has 512-byte blocks, but
the pwl filesystem has 4k blocks (on a 4k disk). Given that the test
consists of small writes, is there any chance that we're multiplying
the number of physical writes in some pathological manner, e.g. each
sub-block write forcing a 4k read-modify-write?
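
If it's worth ruling out, a crude userspace check could be something
like the sketch below (file name and iteration count are made up;
compile with g++ -O2 rmw_check.cc -o rmw_check and point it at a file
on the pwl filesystem). It just times N synchronous writes at 512b vs
4k; if sub-block writes trigger read-modify-write somewhere
underneath, the 512b case should be far slower than the 8x size ratio
alone would explain:

    // rmw_check.cc: crude check for sub-block write amplification (a
    // sketch, not a proper benchmark)
    #include <chrono>
    #include <cstdio>
    #include <cstdlib>
    #include <cstring>
    #include <fcntl.h>
    #include <unistd.h>

    static double time_writes(const char *path, size_t bs, int count) {
      int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
      if (fd < 0) { perror("open"); exit(1); }
      char *buf = static_cast<char *>(malloc(bs));
      memset(buf, 0xab, bs);
      auto t0 = std::chrono::steady_clock::now();
      for (int i = 0; i < count; i++) {
        if (pwrite(fd, buf, bs, (off_t)i * bs) != (ssize_t)bs) {
          perror("pwrite"); exit(1);
        }
        fdatasync(fd);  // force every write out, like a sync-heavy guest
      }
      auto t1 = std::chrono::steady_clock::now();
      free(buf);
      close(fd);
      return std::chrono::duration<double>(t1 - t0).count();
    }

    int main(int argc, char **argv) {
      const char *path = argc > 1 ? argv[1] : "testfile";  // on the pwl fs
      const int count = 1000;
      printf("512b x %d: %.3fs\n", count, time_writes(path, 512, count));
      printf("4k   x %d: %.3fs\n", count, time_writes(path, 4096, count));
      return 0;
    }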

Matt
-- 
Matthew Booth
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


