Hi list,

During cluster expansion (adding extra disks to existing hosts) some OSDs failed with:

  FAILED assert(0 == "unexpected error")
  _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1, counting from 0)

Full details: https://8n1.org/14078/c534

We had the "norebalance", "nobackfill", and "norecover" flags set. After we unset nobackfill and norecover (to let Ceph fix the degraded PGs), it recovered all but 12 objects (2 PGs). We queried the PGs, and the OSDs that were supposed to have a copy of them had already been "probed". A day later (~24 hours) the degraded objects still had not been recovered.

After we unset the "norebalance" flag, it started rebalancing, backfilling and recovering, and the 12 degraded objects were recovered.

Is this expected behaviour? I would expect Ceph to always try to fix degraded objects first and foremost. Even "pg force-recovery" and "pg force-backfill" could not force recovery.

Gr. Stefan

-- 
| BIT BV  http://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6  +31 318 648 688 / info@xxxxxx
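
P.S. For reference, the sequence described above can be sketched as a small dry-run script. It only prints the ceph CLI invocations and does not execute anything. The PG ids are hypothetical placeholders (the two affected PGs are not named in this post), and the point in the sequence at which the query and force commands were run is an assumption.

```python
# Dry-run sketch: builds the ceph CLI invocations described in the post.
# Nothing is executed; the commands are only printed.

# Hypothetical placeholders: the two degraded PGs are not named in the post.
DEGRADED_PGS = ["<pgid-1>", "<pgid-2>"]


def reproduction_steps(pgs):
    """Return the ceph command lines (as argv lists) in the assumed order."""
    steps = []
    # Flags that were set while the extra disks were being added.
    for flag in ("norebalance", "nobackfill", "norecover"):
        steps.append(["ceph", "osd", "set", flag])
    # Allow recovery and backfill again; rebalancing stays disabled.
    for flag in ("nobackfill", "norecover"):
        steps.append(["ceph", "osd", "unset", flag])
    # Inspect the PGs that remained degraded.
    for pg in pgs:
        steps.append(["ceph", "pg", pg, "query"])
    # Attempts to force recovery (assumed to happen before the last step).
    steps.append(["ceph", "pg", "force-recovery", *pgs])
    steps.append(["ceph", "pg", "force-backfill", *pgs])
    # Only after this step were the 12 degraded objects recovered.
    steps.append(["ceph", "osd", "unset", "norebalance"])
    return steps


if __name__ == "__main__":
    for argv in reproduction_steps(DEGRADED_PGS):
        print(" ".join(argv))
```

Replacing the placeholders with real PG ids gives a list of commands that can be reviewed before anything is run by hand.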