Re: cephfs metadata pool: deep-scrub error "omap_digest != best guess omap_digest"

Hi Brad...

Thanks for the feedback. I think we are making some progress.

I have opened the following tracker issue: http://tracker.ceph.com/issues/17177 . 

There I give pointers to all the logs, namely the result of the pg query and all the OSD logs collected with increased log levels (debug_ms=1, debug_filestore=20 and debug_osd=30) during a manual deep-scrub of the inconsistent pg (which, by the way, went fine).
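
For anyone who wants to reproduce this kind of capture, here is a minimal Python sketch. It only builds the command lines and runs nothing. It assumes the levels are raised at runtime with `ceph tell osd.N injectargs`, which is one common way; the text above does not say which mechanism was actually used. The helper name `scrub_debug_commands` is made up for illustration.

```python
def scrub_debug_commands(pgid, osd_ids, levels=None):
    """Return ceph CLI invocations (as argv lists) for a verbose manual deep-scrub.

    Nothing is executed here; each list could be handed to subprocess.run()
    on a node with admin credentials.
    """
    if levels is None:
        # Debug levels quoted in the mail above.
        levels = {"debug_ms": 1, "debug_filestore": 20, "debug_osd": 30}
    # injectargs takes all options as a single string argument.
    args = " ".join(f"--{name} {value}" for name, value in levels.items())
    commands = [["ceph", "tell", f"osd.{osd}", "injectargs", args] for osd in osd_ids]
    # Assumption: the scrub is started by hand once logging is verbose.
    commands.append(["ceph", "pg", "deep-scrub", pgid])
    return commands


if __name__ == "__main__":
    # pg 5.3d0 and its acting set [78,59,49] are taken from the log extract.
    for argv in scrub_debug_commands("5.3d0", [78, 59, 49]):
        print(" ".join(argv))
```

Levels this high produce very large logs, so they would normally be lowered again once the scrub has finished.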

Regarding the question of why this is happening, I do not know. We are running the same version everywhere (including when the server hosting osd.78 was put back into production). We never saw this in Infernalis, and since we upgraded to Jewel it has already happened more than once. Another reason why we could be seeing the issue only now is that it is only under Jewel that we have been massively increasing the number of OSD servers. Under Infernalis the setup was quite stable the whole time.

Regarding which OSD has the bad data, I think we have enough evidence to say that it is the primary (78), i.e.:
- the affected object on the peers has the oldest (and identical) timestamp,
- the pg migrated recently to osd.78, previous deep scrubs (prior to osd.78 becoming the primary) went ok, and the information you pointed out in the pg query result seems to point to inconsistencies between the peers and the primary at the time osd.78 became the primary.

Also, after diving into the logs of the manual deep scrub, I found the following ERANGE message in the peers' OSD logs but nothing in the primary's OSD log. This message is emitted after a getattrs operation on the object. The relevant extract of the logs for all OSDs follows at the end of this email.

2016-08-31 00:55:01.444953 7f2dcbaaa700 10 filestore(/var/lib/ceph/osd/ceph-49)  -ERANGE, len is 208
2016-08-31 00:55:01.444964 7f2dcbaaa700 10 filestore(/var/lib/ceph/osd/ceph-49)  -ERANGE, got 104

So it seems the problem may lie in the extended attributes of the object, which were not replicated properly.
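
To check the same pattern across larger logs, a small Python sketch along these lines could be used. It scans filestore debug lines and groups the -ERANGE messages by OSD. The regular expression is written against the exact line format quoted in this mail and may need adjusting for other versions; the name `erange_by_osd` is invented for this example. It only reports where the message appears and says nothing about whether the message is significant.

```python
import re
from collections import defaultdict

# Matches filestore debug lines of the form quoted in this mail, e.g.
#   ... 10 filestore(/var/lib/ceph/osd/ceph-49)  -ERANGE, len is 208
ERANGE_RE = re.compile(
    r"filestore\(/var/lib/ceph/osd/ceph-(?P<osd>\d+)\)\s+"
    r"-ERANGE, (?P<kind>len is|got) (?P<n>\d+)"
)


def erange_by_osd(lines):
    """Map OSD id -> list of (kind, byte count) for every -ERANGE line seen."""
    hits = defaultdict(list)
    for line in lines:
        m = ERANGE_RE.search(line)
        if m:
            hits[int(m.group("osd"))].append((m.group("kind"), int(m.group("n"))))
    return dict(hits)


# Lines copied from the log extract in this mail (one primary line, two peers).
SAMPLE = [
    "2016-08-31 00:55:01.404194 7f8b2f8f6700 15 filestore(/var/lib/ceph/osd/ceph-78) getattrs 5.3d0_head/#5:0bd6d154:::602.00000000:head#",
    "2016-08-31 00:55:01.444953 7f2dcbaaa700 10 filestore(/var/lib/ceph/osd/ceph-49)  -ERANGE, len is 208",
    "2016-08-31 00:55:01.444964 7f2dcbaaa700 10 filestore(/var/lib/ceph/osd/ceph-49)  -ERANGE, got 104",
    "2016-08-31 00:55:01.417836 7f335510b700 10 filestore(/var/lib/ceph/osd/ceph-59)  -ERANGE, len is 208",
    "2016-08-31 00:55:01.417843 7f335510b700 10 filestore(/var/lib/ceph/osd/ceph-59)  -ERANGE, got 104",
]

if __name__ == "__main__":
    for osd, events in sorted(erange_by_osd(SAMPLE).items()):
        print(f"osd.{osd}: {events}")
```

An OSD that never logs the message, such as the primary here, is simply absent from the returned dictionary.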

Now that I think I know that the primary is wrong, I do not want to use a blind 'ceph pg repair'. However, this raises another question: can I simply delete the problematic object on osd.78 manually and trigger a repair afterwards (as described here: http://ceph.com/planet/ceph-manually-repair-object/ )? Since we are talking about the cephfs metadata pool, which produces 0-size objects and makes heavy use of omap information, I am just wondering whether the procedure should be the same in this case.

Cheers
Goncalo


=======

PRIMARY OSD 78:

2016-08-31 00:55:01.404186 7f8b2f8f6700 10 filestore(/var/lib/ceph/osd/ceph-78) stat 5.3d0_head/#5:0bd6d154:::602.00000000:head# = 0 (size 0)
2016-08-31 00:55:01.404194 7f8b2f8f6700 15 filestore(/var/lib/ceph/osd/ceph-78) getattrs 5.3d0_head/#5:0bd6d154:::602.00000000:head#
2016-08-31 00:55:01.404274 7f8b2f8f6700 20 filestore(/var/lib/ceph/osd/ceph-78) fgetattrs 394 getting '_'
2016-08-31 00:55:01.404292 7f8b2f8f6700 20 filestore(/var/lib/ceph/osd/ceph-78) fgetattrs 394 getting '_parent'
2016-08-31 00:55:01.404302 7f8b2f8f6700 20 filestore(/var/lib/ceph/osd/ceph-78) fgetattrs 394 getting 'snapset'
2016-08-31 00:55:01.404309 7f8b2f8f6700 20 filestore(/var/lib/ceph/osd/ceph-78) fgetattrs 394 getting '_layout'
2016-08-31 00:55:01.404316 7f8b2f8f6700 10 filestore(/var/lib/ceph/osd/ceph-78) getattrs no xattr exists in object_map r = 0
2016-08-31 00:55:01.404319 7f8b2f8f6700 10 filestore(/var/lib/ceph/osd/ceph-78) getattrs 5.3d0_head/#5:0bd6d154:::602.00000000:head# = 0
2016-08-31 00:55:01.404358 7f8b2f8f6700 10 osd.78 pg_epoch: 23099 pg[5.3d0( v 23099'104738 (23099'101639,23099'104738] local-les=22440 n=257 ec=339 les/c/f 22440/22440/0 19928/22439/22439) [78,59,49] r=0 lpr=22439 crt=23099'104736 lcod 23099'104737 mlcod 23099'104737 active+clean+scrubbing+deep+inconsistent] be_deep_scrub 5:0bd6d154:::602.00000000:head seed 4294967295

--- * ---

PEER OSD 49:

2016-08-31 00:55:01.444902 7f2dcbaaa700 10 filestore(/var/lib/ceph/osd/ceph-49) stat 5.3d0_head/#5:0bd6d154:::602.00000000:head# = 0 (size 0)
2016-08-31 00:55:01.444909 7f2dcbaaa700 15 filestore(/var/lib/ceph/osd/ceph-49) getattrs 5.3d0_head/#5:0bd6d154:::602.00000000:head#
2016-08-31 00:55:01.444953 7f2dcbaaa700 10 filestore(/var/lib/ceph/osd/ceph-49)  -ERANGE, len is 208
2016-08-31 00:55:01.444964 7f2dcbaaa700 10 filestore(/var/lib/ceph/osd/ceph-49)  -ERANGE, got 104
2016-08-31 00:55:01.444967 7f2dcbaaa700 20 filestore(/var/lib/ceph/osd/ceph-49) fgetattrs 315 getting '_'
2016-08-31 00:55:01.444974 7f2dcbaaa700 20 filestore(/var/lib/ceph/osd/ceph-49) fgetattrs 315 getting '_parent'
2016-08-31 00:55:01.444980 7f2dcbaaa700 20 filestore(/var/lib/ceph/osd/ceph-49) fgetattrs 315 getting 'snapset'
2016-08-31 00:55:01.444986 7f2dcbaaa700 20 filestore(/var/lib/ceph/osd/ceph-49) fgetattrs 315 getting '_layout'
2016-08-31 00:55:01.444992 7f2dcbaaa700 10 filestore(/var/lib/ceph/osd/ceph-49) getattrs no xattr exists in object_map r = 0
2016-08-31 00:55:01.444994 7f2dcbaaa700 10 filestore(/var/lib/ceph/osd/ceph-49) getattrs 5.3d0_head/#5:0bd6d154:::602.00000000:head# = 0
2016-08-31 00:55:01.444998 7f2dcbaaa700 10 osd.49 pg_epoch: 23099 pg[5.3d0( v 23099'104738 (23099'101639,23099'104738] local-les=22440 n=257 ec=339 les/c/f 22440/22440/0 19928/22439/22439) [78,59,49] r=2 lpr=22439 pi=4173-22438/25 luod=0'0 crt=23099'104736 lcod 23099'104737 active] be_deep_scrub 5:0bd6d154:::602.00000000:head seed 4294967295

--- * ---

PEER OSD 59:

2016-08-31 00:55:01.417801 7f335510b700 10 filestore(/var/lib/ceph/osd/ceph-59) stat 5.3d0_head/#5:0bd6d154:::602.00000000:head# = 0 (size 0)
2016-08-31 00:55:01.417806 7f335510b700 15 filestore(/var/lib/ceph/osd/ceph-59) getattrs 5.3d0_head/#5:0bd6d154:::602.00000000:head#
2016-08-31 00:55:01.417836 7f335510b700 10 filestore(/var/lib/ceph/osd/ceph-59)  -ERANGE, len is 208
2016-08-31 00:55:01.417843 7f335510b700 10 filestore(/var/lib/ceph/osd/ceph-59)  -ERANGE, got 104
2016-08-31 00:55:01.417845 7f335510b700 20 filestore(/var/lib/ceph/osd/ceph-59) fgetattrs 473 getting '_'
2016-08-31 00:55:01.417850 7f335510b700 20 filestore(/var/lib/ceph/osd/ceph-59) fgetattrs 473 getting '_parent'
2016-08-31 00:55:01.417856 7f335510b700 20 filestore(/var/lib/ceph/osd/ceph-59) fgetattrs 473 getting 'snapset'
2016-08-31 00:55:01.417861 7f335510b700 20 filestore(/var/lib/ceph/osd/ceph-59) fgetattrs 473 getting '_layout'
2016-08-31 00:55:01.417866 7f335510b700 10 filestore(/var/lib/ceph/osd/ceph-59) getattrs no xattr exists in object_map r = 0
2016-08-31 00:55:01.417867 7f335510b700 10 filestore(/var/lib/ceph/osd/ceph-59) getattrs 5.3d0_head/#5:0bd6d154:::602.00000000:head# = 0
2016-08-31 00:55:01.417870 7f335510b700 10 osd.59 pg_epoch: 23099 pg[5.3d0( v 23099'104738 (23099'101639,23099'104738] local-les=22440 n=257 ec=339 les/c/f 22440/22440/0 19928/22439/22439) [78,59,49] r=1 lpr=22439 pi=19928-22438/1 luod=0'0 crt=23099'104736 lcod 23099'104737 active] be_deep_scrub 5:0bd6d154:::602.00000000:head seed 4294967295
