Re: PG in active+clean+inconsistent, but list-inconsistent-obj doesn't show it

Ronny Aasen <ronny+ceph-users@xxxxxxxx> · Thu, 28 Sep 2017 10:40:32 +0200

On 28. sep. 2017 09:27, Olivier Migeot wrote:
Greetings,

we're in the process of recovering a cluster after an electrical
disaster. It hasn't gone badly so far; we managed to clear most of the
errors. All that prevents a return to HEALTH_OK now is a bunch (6) of scrub
errors, apparently from a PG that's marked as active+clean+inconsistent.

Thing is, rados list-inconsistent-obj doesn't return anything but an
empty list (plus, in the most recent attempts: error 2: (2) No such
file or directory).

We're on Jewel (waiting for this to be fixed before planning an upgrade),
and the pool our PG belongs to has a replica size of 2.

No success with ceph pg repair, and I already tried to remove and import
the most recent version of said PG in both its acting OSDs: it doesn't
change a thing.

Is there anything else I could try?

Thanks,

size=2 is of course horrible, and I assume you know that... But even
more important: I hope you have min_size=2 so you avoid generating more
problems in the future, or while troubleshooting.

first of all, read this link a few times:
http://ceph.com/geen-categorie/ceph-manually-repair-object/

you need to locate the bad objects to fix them. since
rados list-inconsistent-obj does not work, you need to manually check the
logs of the osds that are participating in the pg in question. grep for
ERR.
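
as a rough sketch of that grep, assuming the osd logs are in the default location (/var/log/ceph/ceph-osd.<id>.log) and that the scrub error lines carry both the pgid and the string ERR. the function name scrub_errors and the example osd id and pgid are made up for illustration. note that a fixed-string match on a short pgid can also hit longer pgids that contain it, so read the output rather than trusting it blindly.

```shell
# Print the error lines that mention one PG from an OSD log file.
# Usage (the path and pgid below are illustrative only):
#   scrub_errors /var/log/ceph/ceph-osd.12.log 1.2f
scrub_errors() {
    log_file=$1
    pg_id=$2
    # -F treats the pgid as a fixed string, so the dot is not a regex wildcard.
    grep -F -- "$pg_id" "$log_file" | grep -F 'ERR'
}
```

run it against the log of every osd in the acting set of the pg, since each osd only logs what it saw itself.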

once you find the name of the object with the problem, you need to locate
it on disk using find /path/of/pg -name 'objectname'
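
a small sketch of that find, assuming filestore osds, where the pg directory is usually /var/lib/ceph/osd/ceph-<id>/current/<pgid>_head and the objects sit in nested subdirectories below it. on disk the file name is normally longer than the object name in the log (escaped characters plus a suffix), so this matches on a fragment of the name instead of the exact name. find_object is a hypothetical helper name, and the fragment should not contain glob characters.

```shell
# Find files under a PG directory whose name contains a fragment of the
# object name taken from the log.
# Usage (path and fragment are illustrative only):
#   find_object /var/lib/ceph/osd/ceph-12/current/1.2f_head 0000000000000a3c
find_object() {
    pg_dir=$1
    fragment=$2
    # Objects live in nested subdirectories, so search recursively.
    find "$pg_dir" -type f -name "*${fragment}*"
}
```

do this on both osds so you end up with one path per replica.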

once you have the object path you need to compare the 2 copies and find
out which one is the bad one. this is where 3x replication would have
helped, since with only two copies, when one is bad, how do you tell the
bad from the good...

the error message in the log may give hints to the error. read and 
understand what the error message is, since it is critical to 
understanding what is wrong with the object.

the object type also helps when determining the wrong one. is it a rados
object, an rbd block or a cephfs metadata or data object? knowing what it
should be helps in determining the wrong one.

things to try:
ls -lh $path ; compare the metadata. are there obvious problems? refer to
the error in the log.
- one has size 0 and there should have been a size?
- one has a size greater than 0 and it should have been size 0?
- one is significantly larger than the other, perhaps one is truncated?
perhaps one has garbage added.

md5sum $path
- perhaps a block has a read error. it would show up with this command and
be a dead giveaway for the problem object.
- compare the checksums. do you know what sum the object should have?

actually look at the object. use strings or hexdump to try to determine
the contents, vs what the object should contain.
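
the size and checksum checks above can be put in one helper. this is only a sketch: it assumes you have copied or mounted both replicas so that both paths are readable from one machine, and compare_replicas is a made-up name. it tells you whether the copies differ, not which one is right; that still takes the log message and your own judgment.

```shell
# Print size and md5 for two copies of an object, then say whether the
# contents are byte-for-byte identical.
# Usage: compare_replicas /path/to/copy_a /path/to/copy_b
compare_replicas() {
    for f in "$1" "$2"; do
        # A read error on a damaged block would surface here as an I/O error.
        size=$(wc -c < "$f" | tr -d ' ')
        sum=$(md5sum "$f" | cut -d ' ' -f 1)
        printf '%s bytes  %s  %s\n' "$size" "$sum" "$f"
    done
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}
```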

if you can locate the bad object, stop the osd, flush its journal, and
move the bad object away (i just mv it to somewhere else). then
restart the osd.

run repair on the pg, tail the logs and wait for the repair and scrub
to finish.
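
the sequence above, written out as a dry-run sketch. it assumes a systemd-managed filestore osd (unit name ceph-osd@<id>); on other init systems the stop and start commands differ. run and remove_bad_copy are hypothetical helper names, and the osd id, pgid and paths are placeholders. by default it only prints the commands; set DRY_RUN=0 once you have read them and are sure which copy is the bad one.

```shell
# Print each command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# remove_bad_copy OSD_ID PG_ID BAD_OBJECT_PATH BACKUP_DIR
remove_bad_copy() {
    osd_id=$1
    pg_id=$2
    bad_path=$3
    backup_dir=$4
    run systemctl stop "ceph-osd@${osd_id}"       # stop the osd holding the bad copy
    run ceph-osd -i "$osd_id" --flush-journal     # flush its journal
    run mv "$bad_path" "$backup_dir/"             # move the bad object aside, do not delete it
    run systemctl start "ceph-osd@${osd_id}"      # restart the osd
    run ceph pg repair "$pg_id"                   # then ask ceph to repair the pg
}
```

keeping the moved file around means you can put it back if it turns out you picked the wrong copy.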

--

if you are unable to determine the good object from the bad, you can try
to determine what file it refers to in cephfs, or what block it refers
to in rbd. by overwriting that file or block in cephfs or rbd you
can indirectly overwrite both objects with new data.

if this is an rbd you should run a filesystem check on the fs on that rbd
after all the ceph problems are repaired.

good luck
Ronny Aasen

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com