Erasure-coded PG stuck in the failed_repair state

Hi,

We've got an outstanding issue with one of our Ceph clusters here at RAL: 'Echo', our 40PB cluster. We found an object from an 8+3 EC RGW pool whose PG is stuck in the failed_repair state. We aren't sure how the object got into this state, but it doesn't appear to be a case of correlated drive failure (the rest of the PG is fine). However, our focus isn't the detail of how we got here; it's how to get the PG back to a clean state.

The object in question (referred to here as OBJNAME) is from a RadosGW data pool. The problem presented initially as a PG in the failed_repair state, and repeated attempts to repair the PG failed. At that point we contacted the user who owns the data and determined that the data in question was also stored elsewhere, so we could safely delete the object. We did that using radosgw-admin object rm OBJNAME, and confirmed that the object is gone via several approaches (radosgw-admin object stat, rados ls --pgid PGID | grep OBJNAME).
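
For completeness, the deletion and verification steps were roughly as follows (BUCKETNAME is a stand-in for the real bucket name, and the exact flag syntax is reconstructed from memory, so treat it as approximate):

radosgw-admin object rm --bucket=BUCKETNAME --object=OBJNAME      # BUCKETNAME/OBJNAME are placeholders
radosgw-admin object stat --bucket=BUCKETNAME --object=OBJNAME    # now errors out, as expected for a deleted object
rados ls --pgid 11.2b5 | grep OBJNAME                             # no longer returns anything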

So far, so good. Except that even after the object was deleted, and despite many repair instructions, the placement group is still in the state active+clean+inconsistent+failed_repair and the cluster won't return to HEALTH_OK. Here's what one of these repair attempts looks like in the primary OSD's log:

2022-05-08 16:23:43.898 7f79d3872700  0 log_channel(cluster) log [DBG] : 11.2b5 repair starts
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 1899(8) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 1911(7) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 2842(10) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 3256(6) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 3399(5) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 3770(9) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 5206(3) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 6047(4) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 soid 11:ad45a433:::OBJNAME:head : failed to pick suitable object info
2022-05-08 19:03:12.690 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 repair 11 errors, 0 fixed
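
For the record, we've been driving these attempts with the usual commands, roughly as below; nothing exotic on our side:

ceph pg repair 11.2b5               # issue the repair against the PG
ceph -w                             # watch the cluster log for the 'repair starts' / 'repair ... errors' lines
ceph health detail | grep 11.2b5    # the PG is still listed as inconsistent+failed_repair afterwards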

Looking for inconsistent objects in the PG doesn't report anything odd about this object. In fact, right now we get the rather odd output below, though it may just be a red herring:

[root@ceph-adm1 ~]# rados list-inconsistent-obj 11.2b5
No scrub information available for pg 11.2b5
error 2: (2) No such file or directory

We don't get this output from this command on any other PG that we've tried.
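
For comparison, this is roughly how we normally inspect scrub inconsistencies, and on every other PG we've tried it returns a JSON report rather than that error:

rados list-inconsistent-obj 11.2b5 --format=json-pretty   # reports inconsistencies recorded by the most recent (deep) scrub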

So what next? To reiterate, this isn't about data recovery, it's about getting the cluster back to a healthy state. I should also note that this issue doesn't seem to be impacting the cluster beyond making that PG show as being in a bad state.

Rob Appleyard



_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


