Re: Erasure-coded PG stuck in the failed_repair state

Hi,

Thanks for your reply. We have let it scrub, and it's still active+clean+inconsistent+failed_repair and we still get the same error:

[root@ceph-adm1 ~]# rados list-inconsistent-obj 11.2b5
No scrub information available for pg 11.2b5
error 2: (2) No such file or directory

My suspicion is that this particular error message is a red herring or at least a symptom far downstream of the bigger issue we have (the PG still stuck in failed_repair), but it would be nice to understand it all the same.
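Incidentally, one way to check that the scrub really did complete against this PG is to look at the scrub timestamps in the PG query output, e.g. (the exact field layout may vary by release):

[root@ceph-adm1 ~]# ceph pg 11.2b5 query | grep scrub_stamp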

Rob


From: Wesley Dillingham <wes@xxxxxxxxxxxxxxxxx>
Sent: 10 May 2022 14:48
To: Appleyard, Robert (STFC,RAL,SC) <rob.appleyard@xxxxxxxxxx>
Cc: ceph-users <ceph-users@xxxxxxx>
Subject: Re:  Erasure-coded PG stuck in the failed_repair state

In my experience:

"No scrub information available for pg 11.2b5
error 2: (2) No such file or directory"

is the output you get from the command when the up or acting OSD set has changed since the last deep scrub. Have you tried running a deep scrub (ceph pg deep-scrub 11.2b5) on the PG and then trying "rados list-inconsistent-obj 11.2b5" again? I do recognize that part of the pg repair also performs a deep scrub, but perhaps the deep scrub alone will help with your attempt to run rados list-inconsistent-obj.
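In other words, something along these lines (PG ID taken from your output above; the --format flag is optional, it just makes the report easier to read):

ceph pg deep-scrub 11.2b5
# wait for the deep scrub to finish, then:
rados list-inconsistent-obj 11.2b5 --format=json-pretty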

Respectfully,

Wes Dillingham
wes@xxxxxxxxxxxxxxxxx
LinkedIn: http://www.linkedin.com/in/wesleydillingham


On Tue, May 10, 2022 at 8:52 AM Robert Appleyard - STFC UKRI <rob.appleyard@xxxxxxxxxx> wrote:
Hi,

We've got an outstanding issue with one of our Ceph clusters here at RAL. The cluster is 'Echo', our 40PB cluster. We found an object in an 8+3 EC RGW pool whose PG is stuck in the failed_repair state. We aren't sure how the object got into this state, but it doesn't appear to be a case of correlated drive failure (the rest of the PG is fine). However, the detail of how we got into this state isn't our focus; what matters is how to get the PG back to a clean state.

The object in question (referred to here as OBJNAME) is from a RadosGW data pool. The problem presented initially as a PG in the failed_repair state, and repeated attempts to get the PG to repair failed. At that point we contacted the user who owns the data and determined that the data in question is also stored elsewhere, so we could safely delete the object. We did that using radosgw-admin object rm OBJNAME, and confirmed that the object is gone with various approaches (radosgw-admin object stat, rados ls --pgid PGID | grep OBJNAME).
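Concretely, the commands were along these lines (bucket and object names replaced with placeholders here; the real radosgw-admin invocations take --bucket/--object):

radosgw-admin object rm --bucket=BUCKETNAME --object=OBJNAME
radosgw-admin object stat --bucket=BUCKETNAME --object=OBJNAME    # no longer finds the object
rados ls --pgid 11.2b5 | grep OBJNAME                             # no longer returns anything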

So far, so good. Except that, even after the object was deleted and in spite of many instructions to repair, the placement group is still in the state active+clean+inconsistent+failed_repair, and the cluster won't go to HEALTH_OK. Here's what one of these repair attempts looks like in the primary OSD's log.

2022-05-08 16:23:43.898 7f79d3872700  0 log_channel(cluster) log [DBG] : 11.2b5 repair starts
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 1899(8) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 1911(7) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 2842(10) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 3256(6) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 3399(5) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 3770(9) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 5206(3) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 shard 6047(4) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 soid 11:ad45a433:::OBJNAME:head : failed to pick suitable object info
2022-05-08 19:03:12.690 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 repair 11 errors, 0 fixed
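For completeness, the repair attempts themselves were just the standard instruction, nothing exotic:

ceph pg repair 11.2b5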

Looking for inconsistent objects in the PG doesn't report anything odd about this object; right now we get the rather odd output below, but we suspect it may just be a red herring.

[root@ceph-adm1 ~]# rados list-inconsistent-obj 11.2b5
No scrub information available for pg 11.2b5
error 2: (2) No such file or directory

We don't get this output from this command on any other PG that we've tried.

So what next? To reiterate, this isn't about data recovery; it's about getting the cluster back to a healthy state. I should also note that the issue doesn't seem to be impacting the cluster beyond leaving that PG reporting a bad state.

Rob Appleyard



_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



