Dear all,
We have a Ceph cluster with several nodes, each containing 4-6 OSDs. We run the OS off a USB drive so that all the drive bays stay free for OSDs, and so far everything has been running fine.
Occasionally the OS on the USB drive fails. When that happens we replace the drive with one that has a similar OS and Ceph pre-configured, so that when the new OS boots it automatically detects all the OSDs and starts them. That part works without any issues.
The issue is with recovery. When a node goes down, all of its OSDs go down with it and recovery starts moving the PG replicas from the affected OSDs to other available OSDs, so the cluster becomes degraded, say 5%, which is expected. However, when we boot the failed node with the new OS and bring its OSDs back up, more PGs get scheduled for backfilling and, instead of dropping, the degradation level shoots up again to, for example, 10%, and on some occasions as high as 19%.
In our experience, when a node goes down the cluster degrades to around 5% and recovery starts, but if we manage to bring the node back up (still on the same OS), the degradation level drops to below 1% and recovery completes much faster.
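For reference, this is a minimal sketch of how the degraded vs. misplaced object ratios can be watched from `ceph status -f json` while this happens (it assumes the `ceph` CLI is on the PATH with a readable keyring; the exact pgmap field names may vary slightly between releases):

```python
#!/usr/bin/env python3
"""Minimal sketch: print degraded vs. misplaced object ratios from `ceph status`.

Assumes the `ceph` CLI is installed and the current user has a keyring that
can read cluster status. The pgmap field names may differ between releases.
"""
import json
import subprocess


def pgmap():
    # `ceph status -f json` returns a JSON document with a "pgmap" section.
    out = subprocess.check_output(["ceph", "status", "-f", "json"])
    return json.loads(out).get("pgmap", {})


def main():
    pg = pgmap()
    # These keys are only present while the cluster actually has
    # degraded/misplaced objects, hence the .get() defaults.
    degraded = pg.get("degraded_ratio", 0.0) * 100
    misplaced = pg.get("misplaced_ratio", 0.0) * 100
    print(f"degraded:  {degraded:.2f}%")
    print(f"misplaced: {misplaced:.2f}%")


if __name__ == "__main__":
    main()
```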
Why doesn't the same behaviour apply in the situation above? The OSD numbers are the same when the node boots up, and the CRUSH map weights are the same as well. Only the hostname is different.
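In case it helps to see what we are comparing: a minimal sketch of how the OSD-to-host mapping and CRUSH weights can be dumped from `ceph osd tree -f json` before and after the swap (again assuming the `ceph` CLI is available; nothing here is specific to our cluster):

```python
#!/usr/bin/env python3
"""Minimal sketch: list which host bucket each OSD sits under, with its CRUSH weight.

Assumes the `ceph` CLI is installed and readable by the current user.
Run it before the node swap and again after, then diff the two outputs.
"""
import json
import subprocess


def crush_nodes():
    # `ceph osd tree -f json` returns {"nodes": [...], "stray": [...]}.
    out = subprocess.check_output(["ceph", "osd", "tree", "-f", "json"])
    return json.loads(out)["nodes"]


def main():
    nodes = crush_nodes()
    by_id = {n["id"]: n for n in nodes}
    for host in (n for n in nodes if n["type"] == "host"):
        for child_id in host.get("children", []):
            osd = by_id[child_id]
            if osd["type"] == "osd":
                print(f'{osd["name"]:>8}  host={host["name"]:<12} '
                      f'crush_weight={osd.get("crush_weight", 0):.5f}')


if __name__ == "__main__":
    main()
```

Diffing the output from before and after the swap shows whether the OSDs come back under the same host bucket or under a new one named after the new hostname.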
Any advice / suggestions?
Looking forward to your reply, thank you.
Cheers.