No recovery when "norebalance" flag set

Hi list,

During a cluster expansion (adding extra disks to existing hosts) some
OSDs failed with the following assert:

  FAILED assert(0 == "unexpected error")
  _txc_add_transaction error (39) Directory not empty not handled on
  operation 21 (op 1, counting from 0)

Full details: https://8n1.org/14078/c534

We had the "norebalance", "nobackfill", and "norecover" flags set. After
we unset nobackfill and norecover (to let Ceph fix the degraded PGs), it
recovered all but 12 objects (2 PGs). We queried the PGs, and the OSDs
that were supposed to hold a copy of them had already been "probed". A
day later (~24 hours) the degraded objects had still not been recovered.
Only after we unset the "norebalance" flag did the cluster start
rebalancing, backfilling and recovering, and the 12 degraded objects
were recovered.
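
For reference, the order of operations described above can be sketched
roughly as follows. This is only an illustration, not a transcript of
what we ran: the PG id 1.2f is a hypothetical placeholder, and the
run() helper and DRY_RUN variable are made up for this sketch so that
the commands are printed rather than executed by default.

```shell
#!/bin/sh
# Sketch of the flag sequence from this report (assumptions: the PG id is
# a placeholder; DRY_RUN and run() are helpers invented for this example).
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recovery_sequence() {
    pgid="$1"

    # Flags that were set during the expansion.
    for flag in norebalance nobackfill norecover; do
        run ceph osd set "$flag"
    done

    # Step 1: allow backfill and recovery, but keep norebalance set.
    run ceph osd unset nobackfill
    run ceph osd unset norecover

    # Step 2: inspect what is still degraded and query a stuck PG.
    run ceph health detail
    run ceph pg "$pgid" query

    # Step 3: in our case the last degraded objects only recovered
    # after this flag was cleared as well.
    run ceph osd unset norebalance
}

recovery_sequence "${PGID:-1.2f}"
```

Running it with DRY_RUN=0 would execute the commands against the
cluster, so treat it as a description of the steps rather than a tool.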

Is this expected behaviour? I would expect Ceph to always try to fix
degraded objects first and foremost. Even "ceph pg force-recovery" and
"ceph pg force-backfill" could not force recovery of these PGs.
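
The force attempts looked roughly like the sketch below. The PG ids
1.2f and 3.a1 are hypothetical placeholders for the two stuck PGs, and
run() / DRY_RUN are again helpers invented for the example so that it
only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: PG ids are placeholders, run() and DRY_RUN are assumptions
# of this example and not part of Ceph.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

force_pgs() {
    # Ask Ceph to prioritise recovery and backfill for the given PGs.
    run ceph pg force-recovery "$@"
    run ceph pg force-backfill "$@"
}

force_pgs 1.2f 3.a1
```

With "norebalance" still set, neither command made the two PGs recover
in our case.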

Gr. Stefan




-- 
| BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / info@xxxxxx


