Removing an OSD node the right way

Dear Cephers,

I had to remove a failed OSD server node. This is what I did:
1) First, I marked all OSDs on the server to be removed as down and out.
2) Then I let Ceph backfill and rebalance, and waited for that to complete.
3) With full redundancy restored, I deleted those OSDs from the cluster, e.g. ceph osd crush remove osd.${OSD_NUM}
4) To my surprise, after removing those already-out OSDs from the cluster, I saw a large number of PGs remapped, and BACKFILLING/REBALANCING started once again.
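
For concreteness, the steps above correspond roughly to the commands printed by the sketch below. It is a dry run that only echoes the commands and never executes them. The function name print_removal_plan and the OSD IDs 10 and 11 are hypothetical placeholders, and "ceph -s" is assumed here as the way to watch for backfilling to finish; the exact commands used on the real cluster may have differed.

```shell
#!/bin/sh
# Dry run: prints the ceph commands for the steps above instead of running them.
# print_removal_plan and the OSD IDs passed to it are hypothetical placeholders.
print_removal_plan() {
    # Step 1: mark every OSD on the failed node down and out.
    for OSD_NUM in "$@"; do
        echo "ceph osd down osd.${OSD_NUM}"
        echo "ceph osd out osd.${OSD_NUM}"
    done
    # Step 2: check cluster status; repeat by hand until backfilling is done.
    echo "ceph -s"
    # Step 3: remove the already-out OSDs from the CRUSH map.
    for OSD_NUM in "$@"; do
        echo "ceph osd crush remove osd.${OSD_NUM}"
    done
}

print_removal_plan 10 11
```

The second round of remapping was observed after the commands of step 3.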

What is the major problem with the above procedure that caused BACKFILLING/REBALANCING to happen twice? Could the root cause be that those OSDs were already "out" but not yet removed from CRUSH? I previously thought that "out" OSDs would not affect CRUSH, but it seems I was wrong.

Any suggestions, comments, or explanations are highly appreciated.

Best regards,

Samuel



huxiaoyu@xxxxxxxxxxxx