On 11/9/22 4:48 AM, Stefan Kooman wrote:
On 11/8/22 21:20, Mark Nelson wrote:
Hi Folks,
I thought I would mention that I've released a couple of performance
articles on the Ceph blog recently that might be of interest to people:
For sure, thanks a lot, it's really informative!
Can we also make special requests? One of the things that would
help us (and CephFS users in general) is to know how CephFS performance
for small files (~512 bytes, 2 KiB, up to say 64 KiB) is affected by the
number of PGs in the CephFS metadata pool.
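A rough sketch of such a sweep, assuming the pg_autoscaler is disabled
for the pool so the value sticks (the pool name "cephfs.fs1.meta" is just
a placeholder):

  # keep the autoscaler from undoing the change between runs
  ceph osd pool set cephfs.fs1.meta pg_autoscale_mode off
  ceph osd pool set cephfs.fs1.meta pg_num 16
  # wait for the PG splits/merges and any backfill to finish, run the benchmark
  ceph osd pool set cephfs.fs1.meta pg_num 128
  # wait again, re-run, compare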
That's an interesting question. I wouldn't really expect the metadata
pool PG count to have a dramatic effect here at counts that result in
reasonable pseudo-random distribution. Have you seen otherwise?
A question that might be answered:
- does it help to provision more PGs for workloads that rely heavily
on OMAP usage by the MDS (or is RocksDB the bottleneck in all cases)?
Tests that might be useful:
- rsync (single threaded, worst case)
- fio random read / write tests with varying IO depths and thread counts (a sketch of such a job file follows below)
- The CephFS devs might know some performance tests in this context
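For the fio item above, a minimal job file one might sweep; the mount
point, file size, and queue depths are placeholders and would be varied
per run:

  ; small-rand.fio -- illustrative only
  [global]
  directory=/mnt/cephfs/fio     ; placeholder CephFS mount point
  ioengine=libaio
  direct=1
  bs=4k
  size=256m
  runtime=60
  time_based=1
  group_reporting=1

  [randwrite]
  rw=randwrite
  iodepth=16
  numjobs=4

Run with 'fio small-rand.fio' and repeat with different iodepth / numjobs
values (and rw=randread for the read side).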
FWIW I wrote the libcephfs backend code for the IOR and mdtest
benchmarks used in the IO500 test suite. Typically I've seen that
libcephfs and kernel cephfs are competitive with RBD for small random
writes over a small file set. It's when you balloon to huge numbers of
directories/files that CephFS can have problems with the way dirfrags
are distributed across active MDSes. Directory pinning can help here if
you have files nicely distributed across lots of directories. If you
have a lot of files in a single directory it can become a problem.
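If it helps, pinning is just an extended attribute on the directory; a
hypothetical layout with two project trees pinned to different MDS ranks
might look like:

  # pin each subtree to a specific MDS rank
  setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/projects/a
  setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/projects/b
  # -v -1 removes the pin and hands the subtree back to the default balancer
  getfattr -n ceph.dir.pin /mnt/cephfs/projects/a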
One of the tricky things with doing these benchmarks is that the PG
placement over the OSDs can heavily impact performance all by itself,
since primary PGs are not placed the same way when the pool has a
different number of PGs. Ideally, therefore, the primaries should be
balanced as evenly as possible. I'm eagerly awaiting the Ceph Virtual
2022 talk "New workload balancer in Ceph". Having the primaries
balanced before these benchmarks run seems to be a prerequisite for an
"apples to apples" comparison.
There can be an effect from a poor primary distribution across OSDs,
but in my experience it's typically been subtle at moderately high PG
counts. The balancer work is certainly interesting though, especially
when you can't have, or don't want, a lot of PGs.
Gr. Stefan