On 06/12/2018 19.25, Gregory Farnum wrote:
>> So, overall I suspect there is a bug which prevents remapped pg data to be discovered. The PG already knows which OSD is
>> the correct candidate, but does not query it.
>>
>> I can try fixing this myself, but I'd need some hints from the developers to relevant code parts.
>>
>> The OSD is stored correctly in pg->might_have_unfound, and I think it should be queried in PG::discover_all_missing, but
>> I'm lost there. I'd appreciate any help tracking this down.
>
> Do you have logging indicating that this particular function is where
> it goes wrong, or did you find it by inspection?
> Since it sounds like this is pretty reproducible, I would try doing
> that with "debug osd = 20" set, and read through the primary's log
> very carefully while it makes these decisions.
> -Greg
>

I found that function by inspecting the sources and trying to figure out where the status displayed in pg query might come from.

I'll see if I can set up a test cluster and reproduce it there; I'd rather not put the production cluster under more load than necessary once again :)

--
Jonas