Re: The last 15 'degraded' items take as many hours as the first 15K?

"Harry G. Coin" <hgcoin@xxxxxxxxx> · Wed, 11 May 2022 18:04:30 -0500

It's a small four-host HDD cluster, 4 OSDs per host, with a fifth host
doing the non-OSD work.  The load is almost entirely CephFS.

On 5/11/22 17:47, Josh Baergen wrote:
Is this on SSD or HDD? RGW index, RBD, or ...? Those all change the
math on single-object recovery time.

Having said that...if the object is not huge and is not RGW index
omap, that slow of a single-object recovery would have me checking
whether I have a bad disk that's presenting itself as significantly
underperforming.
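
One rough way to look for such a disk is to compare per-OSD latency
against the rest of the cluster. The sketch below is only an illustration:
the function name, the threshold factor, and the sample readings are all
hypothetical, and it assumes you have already collected one latency figure
(in milliseconds) per OSD from whatever tool you trust, for example the
commit latencies reported by `ceph osd perf`.

```python
from statistics import median


def flag_slow_osds(latency_ms, factor=3.0, floor_ms=1.0):
    """Return the ids of OSDs whose latency exceeds factor x the median.

    latency_ms: dict mapping OSD id -> latency in milliseconds (assumed input).
    factor:     how many times the median counts as "significantly slow".
    floor_ms:   lower bound on the baseline so an idle cluster with
                near-zero latencies does not flag everything.
    """
    if not latency_ms:
        return []
    baseline = max(median(latency_ms.values()), floor_ms)
    return sorted(osd for osd, ms in latency_ms.items() if ms > factor * baseline)


# Made-up readings for a 16-OSD cluster in which osd.7 stands out.
sample = {osd: 20.0 + osd for osd in range(16)}
sample[7] = 450.0

print(flag_slow_osds(sample))  # -> [7]
```

Using the median rather than the mean keeps a single very slow disk from
dragging the baseline up and hiding itself. A flagged OSD is only a hint;
SMART data and kernel logs for that drive would still need checking.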

Josh

On Wed, May 11, 2022 at 4:03 PM Harry G. Coin <hgcoin@xxxxxxxxx> wrote:
Might someone explain why the count of degraded items can drop by
thousands, sometimes tens of thousands, in the same number of hours it
takes to go from 10 to 0?  For example, after an OSD, or a host with a
few OSDs, goes offline for a while and then reboots.

The cluster has been sitting at exactly one degraded object out of
millions for longer than it took to write this post.

Seems the fewer the number of degraded objects, the less interested the
cluster is in fixing it!

HC

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx