Backfill Performance for

Jonathan Suever <suever@xxxxxxxxx> · Thu, 3 Aug 2023 14:11:05 -0400

I am in the process of expanding our cluster capacity by ~50% and have
noticed some unexpected behavior during the backfill and recovery process.
I'd like to understand it and find out whether a better configuration
would yield a faster and smoother backfill.

Pool Information:

OSDs: 243 spinning HDDs
PGs: 1024 (yes, this is low for our number of disks)

I inherited the cluster with the following settings, which appear to
have been chosen in an attempt to make the cluster recover quickly:

osd_max_backfills: 6 (default is 1)
osd_recovery_sleep_hdd: 0.0 (default is 0.1)
osd_recovery_max_active_hdd: 9

When watching the PGs recover I am noticing a few things:

- All PGs seem to be backfilling at the same time which seems to be in
violation of osd_max_backfills. I understand that there should be 6 readers
and 6 writers at a time, but I'm seeing a given OSD participate in more
than 6 PG backfills. Is an OSD only considered as backfilling if it is not
present in both the UP and ACTING groups (i.e. it will have its data
altered)?

- Some PGs are recovering at a much slower rate than others (some as little
as kilobytes per second) despite the disks being all of a similar speed. Is
there some way to dig into why that may be?

- In general, the recovery is happening very slowly (between 1 and 5
objects per second per PG). Is it possible the settings above are too
aggressive and causing performance degradation due to disk thrashing?

- Currently, all misplaced PGs are backfilling. If I were to change some of
the settings above (specifically `osd_max_backfills`), would that
essentially pause the backfilling PGs, or would those backfills have to end
and then start over once they are done waiting?

- Given that all PGs are backfilling simultaneously, there is no way to
prioritize one PG over another (we have some disks with very high usage
that we're trying to reduce). Would reducing those max backfills allow for
proper prioritization of PGs with force-backfill?

- We have had some OSDs restart during the process. Their misplaced
object count is now zero, but their recovering objects/bytes counters are
still incrementing. Is that expected, and is there a way to estimate when
that will complete?
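
For reference, the per-OSD participation mentioned in the first point can be
tallied with a short script. This is only a rough sketch: it assumes the JSON
from `ceph pg dump pgs_brief --format json` carries `state`, `up`, and
`acting` fields for each PG, and the function names are my own. Treating an
OSD that is in UP but not in ACTING as a backfill target is also an
assumption, not necessarily how Ceph accounts for `osd_max_backfills`.

```python
import json
from collections import Counter


def load_pg_stats(text):
    """Parse the JSON text of a brief PG dump into a list of PG dicts."""
    data = json.loads(text)
    # Assumption: some releases wrap the list in {"pg_stats": [...]},
    # while others return a bare list.
    return data["pg_stats"] if isinstance(data, dict) else data


def backfill_participation(pg_stats):
    """Count, per OSD, the backfilling PGs it appears in and is a target for."""
    involved = Counter()  # OSD is in UP or ACTING of a backfilling PG
    targets = Counter()   # OSD is in UP but not in ACTING (assumed target)
    for pg in pg_stats:
        # The state is a '+'-joined string, e.g. "active+remapped+backfilling";
        # splitting keeps "backfill_wait" from matching "backfilling".
        if "backfilling" not in pg["state"].split("+"):
            continue
        up, acting = set(pg["up"]), set(pg["acting"])
        for osd in up | acting:
            involved[osd] += 1
        for osd in up - acting:
            targets[osd] += 1
    return involved, targets
```

Feeding it the saved output of the dump and comparing `involved` and
`targets` against the configured limit of 6 should show whether the
apparent violation only exists when every OSD in UP and ACTING is counted.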

Thanks for the help!

-Jonathan
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx