Thanks Sam for the confirmation.

Thanks,
Guang

On Mon, Jan 4, 2016 at 3:59 PM, Samuel Just <sjust@xxxxxxxxxx> wrote:
> IIRC, you are running giant. I think that's the log rotate dangling
> fd bug (not fixed in giant since giant is eol). Fixed upstream in
> 8778ab3a1ced7fab07662248af0c773df759653d, firefly backport is
> b8e3f6e190809febf80af66415862e7c7e415214.
> -Sam
>
> On Mon, Jan 4, 2016 at 3:37 PM, Guang Yang <guangyy@xxxxxxxxx> wrote:
>> Hi Cephers,
>> Before I open a tracker, I would like to check whether this is a known issue.
>>
>> On one of our clusters, an OSD crashed during repair. The
>> crash happened after we issued a PG repair for inconsistent PGs, which
>> failed because the recorded file size (within xattr) did not match
>> the actual file size.
>>
>> The mismatch was caused by the fact that the contents of the data file
>> are OSD logs. The following is from osd.354 on c003:
>>
>> -rw-r--r-- 1 yahoo root 75168 Jan 3 07:30
>> default.12061.9\u8396947527\u52ac8b3ec6\uo.jpg__head_A2478171__3_ffffffffffffffff_7
>> -bash-4.1$ head
>> "default.12061.9\u8396947527\u52ac8b3ec6\uo.jpg__head_A2478171__3_ffffffffffffffff_7"
>> 2016-01-03 07:30:01.600119 7f7fe2096700 15
>> filestore(/home/y/var/lib/ceph/osd/ceph-354) getattrs
>> 3.171s7_head/a2478171/default.12061.9_8396947527_52ac8b3ec6_o.jpg/head//3/18446744073709551615/7
>> 2016-01-03 07:30:01.604967 7f7fe2096700 10
>> filestore(/home/y/var/lib/ceph/osd/ceph-354) -ERANGE, len is 494
>> 2016-01-03 07:30:01.604984 7f7fe2096700 10
>> filestore(/home/y/var/lib/ceph/osd/ceph-354) -ERANGE, got 247
>> 2016-01-03 07:30:01.604986 7f7fe2096700 20
>> filestore(/home/y/var/lib/ceph/osd/ceph-354) fgetattrs 61 getting
>> '_user.rgw.idtag'
>> 2016-01-03 07:30:01.604996 7f7fe2096700 20
>> filestore(/home/y/var/lib/ceph/osd/ceph-354) fgetattrs 61 getting '_'
>> 2016-01-03 07:30:01.605007 7f7fe2096700 20
>> filestore(/home/y/var/lib/ceph/osd/ceph-354) fgetattrs 61 getting
>> 'snapset'
>> 2016-01-03 07:30:01.605013 7f7fe2096700 20
>> filestore(/home/y/var/lib/ceph/osd/ceph-354) fgetattrs 61 getting
>> '_user.rgw.manifest'
>> 2016-01-03 07:30:01.605026 7f7fe2096700 20
>> filestore(/home/y/var/lib/ceph/osd/ceph-354) fgetattrs 61 getting
>> 'hinfo_key'
>> 2016-01-03 07:30:01.605042 7f7fe2096700 20
>> filestore(/home/y/var/lib/ceph/osd/ceph-354) fgetattrs 61 getting
>> '_user.rgw.x-amz-meta-origin'
>> 2016-01-03 07:30:01.605049 7f7fe2096700 20
>> filestore(/home/y/var/lib/ceph/osd/ceph-354) fgetattrs 61 getting
>> '_user.rgw.acl'
>>
>>
>> This only happens on the clusters where we turned on verbose logging
>> (debug_osd/filestore=20). We are running ceph v0.87.
>>
>> Thanks,
>> Guang
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
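
To check whether other object files on an OSD were overwritten the same way, one rough approach is to scan the data directory for files whose first bytes look like an OSD log line. The sketch below is only an illustration under stated assumptions: the regular expression is inferred from the log excerpt quoted above (timestamp, hex thread id, debug level), and the names `LOG_LINE`, `looks_like_osd_log` and `find_suspect_files` are hypothetical helpers, not part of Ceph. It does not inspect xattrs or recorded object sizes, and a real object that happens to start with similar text would be flagged as well.

```python
import os
import re

# Assumed log-line prefix, inferred from the excerpt in this thread:
#   "2016-01-03 07:30:01.600119 7f7fe2096700 15 filestore(...) ..."
# i.e. date, time with fractional seconds, hex thread id, debug level.
LOG_LINE = re.compile(
    rb"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ [0-9a-f]+\s+-?\d+\s"
)


def looks_like_osd_log(path, probe_bytes=256):
    """Return True if the start of the file matches the assumed log prefix."""
    with open(path, "rb") as f:
        head = f.read(probe_bytes)
    return LOG_LINE.match(head) is not None


def find_suspect_files(root):
    """Walk root and return, sorted, every file that starts like a log line."""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                if looks_like_osd_log(full):
                    suspects.append(full)
            except OSError:
                # Unreadable or vanished file: skip it rather than abort the scan.
                continue
    return sorted(suspects)
```

Calling `find_suspect_files()` on an OSD's data directory (for example the `/home/y/var/lib/ceph/osd/ceph-354` path that appears in the log above) would list candidates for manual inspection; it only reads files and changes nothing.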