Re: crimson-osd vs legacy-osd: should the perf difference be already noticeable?

Hi Roman,

On Thu, Jan 9, 2020 at 2:51 PM Roman Penyaev <rpenyaev@xxxxxxx> wrote:
> The first thing that catches my eye is that for small blocks there is
> no big difference at all, but as the block size increases, crimson's
> IOPS start to decline. Could it be a transport issue? That can be
> tested as well.

This is a known issue with Seastar's POSIX-based network stack. As
Kefu pointed out, even large payloads are retrieved from the kernel
in multiple small, fixed-size chunks. That follows from how the
internal interfaces were shaped. My personal impression is that their
design favors the native stack / DPDK while avoiding differentiated
behavior among the stacks (likely so as not to surprise developers).
Moreover, crimson-osd adds a memcpy on top of Seastar to reconcile
those tiny chunks into a flat buffer.
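
As a rough illustration of the two costs described above, here is a
minimal Python sketch. It is not Seastar or crimson code: the 8 KiB
chunk size, the function names and the in-memory source are all
assumptions made for the example. The first function mimics reading a
payload in small fixed-size chunks and then copying them into one flat
buffer; the second reads straight into a buffer sized up front, which
is roughly the idea behind handing the stack a caller-provided buffer.

```python
import io

CHUNK_SIZE = 8192  # hypothetical fixed chunk size, chosen only for illustration


def read_chunked_then_flatten(src, length, chunk_size=CHUNK_SIZE):
    """Read `length` bytes in fixed-size chunks, then build one flat buffer."""
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = src.read(min(chunk_size, remaining))
        if not chunk:
            raise EOFError("source ended before the payload was complete")
        chunks.append(chunk)
        remaining -= len(chunk)
    # Joining the chunks copies the payload bytes again into a new buffer.
    flat = b"".join(chunks)
    return flat, len(chunks)


def read_into_preallocated(src, length):
    """Read `length` bytes directly into a buffer allocated up front."""
    buf = bytearray(length)
    view = memoryview(buf)
    filled = 0
    while filled < length:
        # readinto() writes into the slice in place; no intermediate chunk objects.
        n = src.readinto(view[filled:])
        if not n:
            raise EOFError("source ended before the payload was complete")
        filled += n
    return buf


payload = bytes(range(256)) * 256  # 64 KiB test payload
flat, n_chunks = read_chunked_then_flatten(io.BytesIO(payload), len(payload))
direct = read_into_preallocated(io.BytesIO(payload), len(payload))
print(n_chunks, len(flat), len(direct))
```

Both functions return the same bytes; the difference is that the
first one touches the payload twice and creates one object per chunk,
which is the kind of overhead that grows with the block size.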

Here is a more detailed gist:
  https://gist.github.com/rzarzynski/a1d67dc39b0ef4d49cb522179b1f3c89

There are branches (for both Seastar and crimson) with a PoC of
the "input buffer factory" that targets those issues. A performance
comparison is here:
  https://gist.github.com/rzarzynski/ad0aaa80b26603bc1a803ce0d209ac87

Also, when narrowing the comparison to async-msgr vs crimson-msgr
(with the input buffer factory, ibf) I wouldn't expect much of a
difference. In read tests we're observing pretty similar IPC for both
crimson-osd and msgr-worker-n (single-thread profiling). The thing
that might change a lot is the native stack. In Intel's testing it
significantly (up to 30-40% IIRC) improved the IPC of crimson-msgr.
Combined with good SPDK support in Seastore, it might render the POSIX
stack (and thus the need for ibf) somewhat obsolete.

Quick note on saturation: please be careful when judging it with
top, pidstat or even perf stat. In contrast to the legacy OSD, Seastar
does busy-wait for a while. This greatly exaggerates the reported CPU
utilisation under modest workloads. And yes, **crimson is all about
computational efficiency**. We're much more interested in cycles/op
than in raw IOPS, to be honest.
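
To make the cycles/op point concrete, here is a small Python sketch.
All numbers in it are made up for illustration (they are not
measurements of crimson-osd or the legacy OSD), and the split into
"useful" and "polling" cycles is a simplifying assumption about how a
busy-waiting reactor spends its time.

```python
def cycles_per_op(cpu_cycles, completed_ops):
    """Average CPU cycles spent per completed operation."""
    if completed_ops <= 0:
        raise ValueError("completed_ops must be positive")
    return cpu_cycles / completed_ops


def utilisation(useful_cycles, polling_cycles, available_cycles):
    """Fraction of the core that looks busy to a tool such as top."""
    return (useful_cycles + polling_cycles) / available_cycles


# Made-up numbers: one core at 3 GHz observed for one second under a modest load.
available = 3_000_000_000
useful = 600_000_000      # cycles spent actually serving requests (assumed)
polling = 2_100_000_000   # cycles burnt busy-waiting for new work (assumed)
ops = 20_000              # operations completed in that second (assumed)

apparent = utilisation(useful, polling, available)   # what the tools report
useful_only = utilisation(useful, 0, available)      # work actually done
naive_cost = cycles_per_op(useful + polling, ops)    # polling counted as work
useful_cost = cycles_per_op(useful, ops)             # polling excluded

print(apparent, useful_only, naive_cost, useful_cost)
```

In this toy model the core looks almost saturated although most of the
cycles are polling, so a cycles/op figure taken at modest load is
inflated in the same way. Comparing cycles/op closer to saturation, or
with the polling share accounted for, gives a fairer picture.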

Regards,
Radek