Dear All,

Following a clunky* cluster restart, we had:

  23 "objects unfound"
  14 pgs in recovery_unfound

We could see no way to recover the unfound objects, so we decided to mark the unfound objects in one pg as lost:

  [root@ceph1 bad_oid]# ceph pg 5.f2f mark_unfound_lost delete
  pg has 2 objects unfound and apparently lost marking

Unfortunately, this immediately crashed the primary OSD for this PG.

An OSD log showing the osd crashing 3 times is here: <http://p.ip.fi/gV8r>

The assert was:

> 2020-02-10 13:38:45.003 7fa713ef3700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.6/rpm/el7/BUILD/ceph-14.2.6/src/osd/PrimaryLogPG.cc: In function 'int PrimaryLogPG::recover_missing(const hobject_t&, eversion_t, int, PGBackend::RecoveryHandle*)' thread 7fa713ef3700 time 2020-02-10 13:38:45.000875
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.6/rpm/el7/BUILD/ceph-14.2.6/src/osd/PrimaryLogPG.cc: 11550: FAILED ceph_assert(head_obc)

Questions:

1) Is it possible to recover the flapping OSD, or should we fail out the flapping OSD and hope the cluster recovers?

2) We have 13 other pgs with unfound objects. Do we need to mark_unfound these one at a time and then fail out their primary OSD, allowing the cluster to recover before marking unfound objects in the next pg and failing its primary OSD? (A rough sketch of the per-PG sequence we have in mind is in the P.S. below.)

* thread describing the bad restart:
<https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/IRKCDRRAH7YZEVXN5CH4JT2NH4EWYRGI/#IRKCDRRAH7YZEVXN5CH4JT2NH4EWYRGI>

many thanks!

Jake

--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Francis Crick Avenue, Cambridge CB2 0QH, UK.
Phone 01223 267019
Mobile 0776 9886539
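P.S. To make question 2 concrete, here is a rough sketch of the per-PG sequence we have in mind. It is only a sketch: 5.f2f is the pg from above, <osd-id> stands for whatever primary the cluster reports for a given pg, and the mark_unfound_lost step is the one that has already crashed a primary for us.

  # list the pgs that still have unfound objects
  ceph health detail | grep unfound

  # for one pg, inspect its unfound objects and find its primary OSD
  # (the first OSD in the up/acting set)
  ceph pg 5.f2f list_unfound
  ceph pg map 5.f2f

  # mark the unfound objects lost (the step that triggered the assert above)
  ceph pg 5.f2f mark_unfound_lost delete

  # if the primary then flaps/asserts, fail it out and wait for the
  # cluster to settle before moving on to the next pg
  ceph osd out <osd-id>
  ceph -s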