fix active+clean+inconsistent on cephfs when digest != digest


Glad you figured it out!

In the future you can also do repairs based on the underlying RADOS
objects. Generally speaking errors like this mean that the replicas
are storing objects that don't match, but if you go to each OSD
storing the object and find the raw file you will generally find that
two of them match and one doesn't. Simply replacing the bad object
with a copy of the one that matches and then running a scrub again
should mark it consistent.
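
As a rough illustration of the comparison described above, here is a minimal Python sketch. It assumes you have already located the raw file for the object on each OSD; the function names are hypothetical. It uses SHA-256 only to compare the copies with each other, which is not the digest Ceph reports in its scrub messages. Note that with only two copies (a pool with size = 2) there is no majority, so the sketch returns None rather than guessing which copy is good.

```python
import hashlib
from collections import Counter


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_odd_replicas(paths):
    """Return the paths whose content differs from the majority copy.

    Returns [] if all copies match, and None if no strict majority
    exists (for example two copies that disagree with each other).
    """
    if not paths:
        raise ValueError("need at least one replica path")
    digests = {str(p): sha256_of(p) for p in paths}
    counts = Counter(digests.values())
    majority, votes = counts.most_common(1)[0]
    if votes * 2 <= len(digests):
        return None  # no strict majority, cannot tell which copy is bad
    return sorted(p for p, d in digests.items() if d != majority)
```

Calling find_odd_replicas() with the three raw file paths would return the one path that disagrees with the other two; deciding what to do with that copy remains a manual step.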

I'm not sure why the pg repair didn't make them all consistent though...Sam?
-Greg

On Tue, May 19, 2015 at 11:16 AM, core <core at unixnews.ch> wrote:
> Hi list,
>
> I was struggling for quite a while with the problem that some PGs on my cephfs data pool stayed inconsistent and could not be repaired. The message in the OSD's log was like
>
>
>>> repair 11.23a 57b4363a/20000015b67.000006e1/head//11 on disk data digest 0x325d0322 != 0xe8c0243
>
> and then the repair finished without fixing the error.
>
>
> After searching a long time I finally stumbled over this mail : http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-July/041618.html
> which helped to understand and solve the problem.
>
> It finally turned out to be an issue that can be easily remediated, so if others run into the same problem this procedure may help - if you have other suggestions on how to fix this issue please feel free to comment.
> Basically it's only about identifying the problematic inode and removing it, which means copying the file (which generates a new inode number) and removing the old one.
>
>
> My cluster :
>
> Hammer version: ceph version 0.94.1
> Nodes : 4
> OSDs : 20 x 2TB
> problem Pool: cephfs data pool, size = 2
>
> The problem: during power outages all cluster nodes came back uncontrolled and disappeared again (flapping), so after a while everything got shuffled around and under normal conditions I would say it's broken. Ceph survived ! cool :-)
>
> Anyway there were these "active+clean+inconsistent" PGs that even "ceph pg repair <pgid>" could not fix anymore because the base digest was wrong.
>
> This procedure only works for CephFS data pools. So here is what I did :
>
> ---------
>
> 1) check which PGs are inconsistent
>> ceph health detail
>
> 2) check which OSDs are active for these PGs
>> ceph pg map <pgid>
>
> 3) check the OSD logs for repair errors to find the affected inode
>> grep <pgid> /var/log/ceph/*osd*log
>>> repair 11.23a 57b4363a/20000015b67.000006e1/head//11 on disk data digest 0x325d0322 != 0xe8c0243
>
> the object name has the form <inode number in hex>.<object index in hex> : 20000015b67.000006e1 => this is the inode number : 20000015b67
>
> 4) having the hex inode number, convert it to an integer to be passed to find :
>> (python) print(int("0x20000015b67", 0))
>> (python) 2199023344487
>
> 5) now we have the inode number and can search on the cephfs for this inode number :
>> find /cephfs -inum 2199023344487
>
> 6) usually you'll end up with 1 file that you can just copy like :
>> cp <original> <original_new>
>
> 7) remove the original file (and with it the inode), then move the copy into its place
>> rm <original> && mv <original_new> <original>
>
> 8) now run ceph pg repair once again
>> ceph pg repair <pgid>
>
> 9) the broken digest will be removed as the inode does not exist anymore and the inconsistency goes away.
>> problem solved :-)
>
>
> ---------
>
> That's all - the cluster is clean again and I am quite impressed by the stability (and for sure the performance) and would like to thank the developers for this great piece of code ;)
>
> Thanks and best regards
>
> marco
>
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
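
Steps 3 to 5 of the quoted procedure (pulling the inode out of the repair log line, converting it from hex, and searching the mounted filesystem for it) could be scripted roughly as follows. This is only a sketch: the regular expression is inferred from the single log line quoted in the message and may not match other Ceph versions, and the function names are hypothetical. The directory walk does the same job as `find <mountpoint> -inum <n>`.

```python
import os
import re

# Inferred from the one quoted log line, e.g.
#   repair 11.23a 57b4363a/20000015b67.000006e1/head//11 on disk data digest ...
# CephFS data objects are named <inode in hex>.<object index in hex>.
OBJECT_RE = re.compile(r"/([0-9a-f]+)\.([0-9a-f]{8})/head")


def inode_from_log_line(line):
    """Return the decimal inode number found in a repair log line, or None."""
    m = OBJECT_RE.search(line)
    if m is None:
        return None
    return int(m.group(1), 16)


def find_by_inode(root, inode):
    """Yield every file path under root whose inode number equals inode."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_ino == inode:
                    yield path
            except OSError:
                continue  # file vanished or is unreadable; skip it


if __name__ == "__main__":
    line = ("repair 11.23a 57b4363a/20000015b67.000006e1/head//11 "
            "on disk data digest 0x325d0322 != 0xe8c0243")
    inode = inode_from_log_line(line)
    print(hex(inode), inode)  # 0x20000015b67 2199023344487
```

With the filesystem mounted, `list(find_by_inode("/cephfs", inode))` would give the candidate file(s) to copy and replace as in steps 6 and 7.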

