Re: Slow recovery on Quincy

As someone in this thread noted, the cost-related config options are
removed in the next version (ceph-17.2.6-45.el9cp). The cost parameters
may not work in all cases because of inherent differences in the
underlying device types and other external factors.

With the aim of a more hands-free operation, the next version includes
significant improvements that should help resolve the issues described
in this thread. The most significant ones are listed below:

1. The mClock profile QoS parameters (reservation and limit) are now
   simplified: they are specified as a fraction of the OSD's IOPS
   capacity rather than in absolute IOPS.
2. The default mClock profile is now the 'balanced' profile, which
   gives equal priority to client and background OSD operations.
3. The allocations for different types of OSD ops within mClock
profiles are changed.
4. The earlier cost-related config parameters are removed. The cost is
   now determined by the OSD based on the underlying device
   characteristics, i.e., its 4 KiB random IOPS capacity and the
   device's max sequential bandwidth.
   - The random IOPS capacity is determined using 'osd bench' as
     before, but unrealistic values are no longer accepted: if the
     measurement crosses the threshold governed by
     *osd_mclock_iops_capacity_threshold_[hdd|ssd]*, a reasonable
     default is used instead. Users may override the default IOPS
     capacity if it is not accurate. The thresholds are configurable
     too.
   - The max sequential bandwidth is defined by
     *osd_mclock_max_sequential_bandwidth_[hdd|ssd]* and is set to a
     reasonable default. Again, it may be modified if not accurate.
   Together, these changes account for measurement inaccuracies and
   give the user good control over specifying accurate OSD
   characteristics.
5. Degraded recoveries are given higher priority than backfills.
   Therefore, you may observe faster degraded recovery rates compared
   to backfills with the 'balanced' and 'high_client_ops' profiles.
   The backfills will still show healthy rates compared to the slow
   backfill rates mentioned in this thread. For faster recovery and
   backfills, the 'high_recovery_ops' profile with modified QoS
   parameters would help.
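
To make items 1 and 4 concrete, here is a minimal Python sketch of the
two ideas: falling back to a default when the 'osd bench' result
crosses the threshold, and turning a fractional reservation or limit
into IOPS. This is not the OSD's actual code. The function names are
hypothetical, and the numbers are illustrative placeholders rather than
the shipped defaults; please check the documentation for the real
values.

```python
def effective_iops_capacity(measured_iops, threshold_iops, default_iops):
    """Pick the IOPS capacity to use for an OSD.

    Assumption: a measurement above the threshold is treated as
    unrealistic and replaced by the default, as described in item 4.
    """
    if measured_iops > threshold_iops:
        return default_iops
    return measured_iops


def fraction_to_iops(fraction, capacity_iops):
    """Convert a QoS fraction (item 1) into an absolute IOPS figure."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0.0 and 1.0")
    return fraction * capacity_iops


# Illustrative placeholder numbers for an SSD-backed OSD (assumptions).
THRESHOLD_IOPS = 80000
DEFAULT_IOPS = 21500

# A bench result above the threshold is ignored in favour of the default.
capacity = effective_iops_capacity(120000, THRESHOLD_IOPS, DEFAULT_IOPS)
print(capacity)

# A reservation of 0.5 then means half of that capacity.
print(fraction_to_iops(0.5, capacity))
```

With these placeholder numbers, a bench result of 120000 IOPS is
discarded, and a reservation of 0.5 is taken against the default
capacity instead.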

Please see the latest upstream documentation for more details:
https://docs.ceph.com/en/quincy/rados/configuration/mclock-config-ref/
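
For anyone who wants to try the 'high_recovery_ops' suggestion from
item 5, the profile is selected with the osd_mclock_profile option via
'ceph config set'. Here is a small Python sketch that only builds and
prints the command line, so that it can be reviewed before being run.
The helper name mclock_profile_command is hypothetical, and the sketch
does not touch the QoS parameters themselves; see the documentation
linked above for those.

```python
import shlex
from typing import List

# Profile names accepted by osd_mclock_profile.
KNOWN_PROFILES = {"balanced", "high_client_ops", "high_recovery_ops", "custom"}


def mclock_profile_command(profile: str, target: str = "osd") -> List[str]:
    """Build the 'ceph config set' argv that selects an mClock profile.

    'target' is the config section, e.g. "osd" for all OSDs or "osd.0"
    for a single daemon.
    """
    if profile not in KNOWN_PROFILES:
        raise ValueError("unknown mClock profile: " + profile)
    return ["ceph", "config", "set", target, "osd_mclock_profile", profile]


# Print the command instead of executing it.
print(shlex.join(mclock_profile_command("high_recovery_ops")))
```

Passing target="osd.0" instead limits the change to a single OSD,
which may be a safer way to observe the effect first.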


The recommendation is to upgrade when feasible and to share your
feedback, questions, and suggestions.

-Sridhar