On 5/12/22 18:02, Harry G. Coin wrote:
Thanks Janne and all for the insights! The reason I half-jokingly suggested the cluster 'lost interest' in those last few fixes is that the recovery statistics included in ceph -s reported near-zero activity for so long. After a long while those last few 'were fixed' --- but if the cluster was moving metadata around to fix the 'holdout repairs', that traffic never showed up in the stats. Those last few objects/PGs to be repaired seemingly got fixed 'by magic that didn't involve moving any data counted in the ceph -s stats'.
It's probably the OMAP data (lots of key/value pairs) that takes a long time to replicate. We have PGs with over 4 million objects consisting of just OMAP, and those can take up to 45 minutes each to recover, all while showing very little network throughput (and those are NVMe OSDs). You can check this with "watch -n 3 ceph pg ls remapped" and see how long each backfill takes, and whether a PG has a lot of OMAP_BYTES and OMAP_KEYS but hardly any BYTES.
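If you want to pull those numbers out programmatically rather than eyeballing the table, something like the following should work. It's a rough sketch: it assumes the JSON from "ceph pg ls remapped -f json" carries a "pg_stats" array whose "stat_sum" includes "num_bytes", "num_omap_bytes", and "num_omap_keys" (field names and the top-level layout can vary between Ceph releases, so check your own output first):

    # List remapped PGs with their data size vs. OMAP footprint.
    # PGs with large num_omap_* but small num_bytes are the slow,
    # key/value-heavy ones described above.
    ceph pg ls remapped -f json | jq -r '
        .pg_stats[]
        | [.pgid, .stat_sum.num_bytes,
           .stat_sum.num_omap_bytes, .stat_sum.num_omap_keys]
        | @tsv'

Run it under "watch -n 3" like the plain command and you can see the OMAP counters tick down as each backfill progresses.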
Gr. Stefan