Haven't seen that exact issue. One thing to note, though: if
osd_max_backfills is set to 1, PGs can enter the backfill state and take
that single reservation on a given OSD, leaving the recovery_wait PGs
unable to get a slot. Backfill prioritization is supposed to prevent
this, but in my experience luminous v12.2.8 doesn't always get it right.
So next time I'd try injecting osd_max_backfills = 2 or 3 to kickstart
the recovering PGs (example commands at the end of this message).

-- dan

On Sun, Nov 25, 2018 at 8:41 PM Stefan Kooman <stefan@xxxxxx> wrote:
>
> Hi list,
>
> During cluster expansion (adding extra disks to existing hosts) some
> OSDs failed (FAILED assert(0 == "unexpected error", _txc_add_transaction
> error (39) Directory not empty not handled on operation 21 (op 1,
> counting from 0); full details: https://8n1.org/14078/c534). We had the
> "norebalance", "nobackfill", and "norecover" flags set. After we unset
> nobackfill and norecover (to let Ceph fix the degraded PGs) it recovered
> all but 12 objects (2 PGs). We queried the PGs and the OSDs that were
> supposed to have a copy of them, and they had already been "probed". A
> day later (~24 hours) the degraded objects had still not recovered.
> After we unset the "norebalance" flag it started rebalancing,
> backfilling and recovering, and the 12 degraded objects were recovered.
>
> Is this expected behaviour? I would expect Ceph to always try to fix
> degraded things first and foremost. Even "pg force-recover" and "pg
> force-backfill" could not force recovery.
>
> Gr. Stefan
>
> --
> | BIT BV  http://www.bit.nl/    Kamer van Koophandel 09090351
> | GPG: 0xD14839C6          +31 318 648 688 / info@xxxxxx
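
For reference, a rough sketch of the injection on luminous; the value 3,
the revert back to 1, and the health checks are only illustrative, so
adjust them for your cluster:

    # Bump the reservation limit (used by both backfill and recovery)
    # on all OSDs. This is a runtime change only and is not persisted
    # across OSD restarts.
    ceph tell osd.* injectargs '--osd_max_backfills 3'

    # Watch whether the recovery_wait/degraded PGs now get reservations.
    ceph -s
    ceph health detail

    # Once recovery has finished, drop back to the previous value.
    ceph tell osd.* injectargs '--osd_max_backfills 1'

Note that injectargs only affects the running daemons; put the setting
in ceph.conf if it should survive restarts.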