On Tue, Dec 23, 2014 at 4:17 AM, Samuel Just <sam.just@xxxxxxxxxxx> wrote:
> Oh, that's a bit less interesting. The bug might still be around, though.
> -Sam
>
> On Mon, Dec 22, 2014 at 2:50 PM, Andrey Korolyov <andrey@xxxxxxx> wrote:
>> On Tue, Dec 23, 2014 at 1:12 AM, Samuel Just <sam.just@xxxxxxxxxxx> wrote:
>>> You'll have to reproduce with logs on all three nodes. I suggest you
>>> open a high-priority bug and attach the logs.
>>>
>>> debug osd = 20
>>> debug filestore = 20
>>> debug ms = 1
>>>
>>> I'll be out for the holidays, but I should be able to look at it when
>>> I get back.
>>> -Sam
>>>
>>
>> Thanks Sam,
>>
>> Although I am not sure whether this is of more than historical interest
>> (the cluster in question is still running Cuttlefish), I'll try to
>> collect logs for the scrub.

Same stuff:
https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg15447.html
https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg14918.html

It looks like the issue is still with us, though it seems to need metadata
or file-structure corruption to show itself. I'll check whether it can be
reproduced by rsync -X of a secondary PG subdirectory onto the primary's
PG subdirectory, or vice versa (rough sketch below). In my case the same
objects, with identical checksums, show up under slightly different
pathnames, which may be the root cause. Since every case mentioned here,
including mine, happened after a hardware failure, I suspect the incurable
corruption is introduced while the primary backfills from a surviving
replica during recovery.
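
For the record, roughly what I have in mind for the reproduction attempt.
This is only a sketch: the pg id 2.1f, the osd ids and the hostnames are
made up for illustration and would have to match the actual cluster, and
the target OSD has to be stopped before its filestore is touched:

  # pull the replica's copy of the PG directory onto the primary,
  # preserving xattrs (-X) so object metadata comes along with the files
  rsync -aX --delete \
      root@sec-host:/var/lib/ceph/osd/ceph-12/current/2.1f_head/ \
      /var/lib/ceph/osd/ceph-3/current/2.1f_head/

  # restart the osd, then force a scrub of that PG and watch for
  # inconsistency reports
  ceph pg scrub 2.1f

and then the same in the other direction (primary -> secondary). If the
differing pathnames alone are enough to make scrub flag the PG, that would
point at the backfill path rather than at the failed hardware itself.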