On 13/12/2018 13.38, Jonas Jelten wrote:
>
> I now tested this on a 5-node, 20-OSD, 3-replica-only cluster.
> Easy steps to reproduce seem to be:
>
> * Have a healthy cluster
> * ceph osd set pause                                # make sure no writes mess up the test
> * ceph osd set nobackfill
> * ceph osd set norecover                            # make sure the error is not recovered but instead stays
> * ceph tell 'osd.*' injectargs '--debug_osd=20/20'  # turn up logging
> * ceph osd out $osdid                               # take out a random OSD
> * observe the state: objects are already degraded; check pg query.
>   In my test, I observe that $osdid was "already probed", even though it does have the data
>   and the cluster was completely healthy before.
> * ceph osd down $osdid                              # repeer this OSD; it comes up again right away
> * observe the state again: even more objects are degraded now; check pg query.
>   In my test, $osdid is now "not queried".
> * ceph osd in $osdid                                # everything turns back to normal and healthy
> * ceph tell 'osd.*' injectargs '--debug_osd=1/5'    # silence logging again
> * ceph osd unset ...                                # unset the flags
>
> In summary: with recovery prevented, an out OSD produces degraded objects, and an out and
> repeered OSD produces even more degraded objects. Marking it in again discovers all the
> missing object copies.
> I've posted the level-20 log of an OSD to https://tracker.ceph.com/issues/37439
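
For anyone who wants to retry this quickly, here is a rough script that strings the steps
above together. It is only a sketch: it assumes a disposable test cluster, takes the id of
the OSD to take out as its first argument, and uses "ceph -s" as a stand-in for the
"observe the state / check pg query" steps, so adapt it before running it anywhere that
matters.

#!/usr/bin/env bash
# Sketch of the reproduction steps above. Assumes a throwaway test cluster;
# pass the id of the OSD to take out as the first argument.
set -euo pipefail
osdid="$1"

ceph osd set pause                                 # make sure no writes mess up the test
ceph osd set nobackfill
ceph osd set norecover                             # keep the degradation from being repaired
ceph tell 'osd.*' injectargs '--debug_osd=20/20'   # turn up logging

ceph osd out "$osdid"                              # take out the OSD
ceph -s                                            # degraded objects show up here; also check pg query
read -rp "press enter to repeer osd.$osdid ... "

ceph osd down "$osdid"                             # repeer; the OSD comes up again right away
ceph -s                                            # even more degraded objects now
read -rp "press enter to take osd.$osdid back in ... "

ceph osd in "$osdid"                               # everything goes back to healthy
ceph tell 'osd.*' injectargs '--debug_osd=1/5'     # silence logging again
for flag in pause nobackfill norecover; do         # unset the flags
    ceph osd unset "$flag"
done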