Re: OSD replacement feature

Wei-Chung Cheng <freeze.vicente.cheng@xxxxxxxxx> · Fri, 20 Nov 2015 15:55:31 +0800

Hi Loic and cephers,

Sure, I have time to help with (and comment on) the disk replacement feature.
It is a useful feature for handling disk failures :p

A simple procedure is described at http://tracker.ceph.com/issues/13732 :
1. Set the noout flag. If the broken OSD is a primary OSD, can we
handle that well?
2. Stop the OSD daemon and wait until the OSD is actually down (or
maybe use the deactivate option of ceph-disk).
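
A rough sketch of the two steps above, in Python. It only builds the
commands and checks a JSON dump; nothing is executed. The assumptions are
mine, not from the tracker issue: the daemon is managed by systemd as
ceph-osd@<id>, and `ceph osd dump --format json` lists each OSD with "osd"
and "up" fields. The function names are hypothetical.

```python
import json


def pre_replacement_commands(osd_id):
    """Return steps 1 and 2 as argv lists (nothing is run here)."""
    return [
        # Step 1: keep the cluster from marking the stopped OSD out.
        ["ceph", "osd", "set", "noout"],
        # Step 2: stop the daemon (assumes a systemd-managed OSD).
        ["systemctl", "stop", "ceph-osd@%d" % osd_id],
    ]


def osd_is_down(osd_dump_json, osd_id):
    """Check whether an OSD is reported down in `ceph osd dump --format json`.

    Assumes each entry under "osds" carries "osd" (the id) and "up" (0 or 1).
    """
    dump = json.loads(osd_dump_json)
    for osd in dump.get("osds", []):
        if osd.get("osd") == osd_id:
            return osd.get("up") == 0
    raise KeyError("osd.%d not found in osd dump" % osd_id)
```

A caller would poll osd_is_down() after step 2 and only continue once it
returns True.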

These two steps seem OK.
As for handling the CRUSH map, should we remove the broken OSD from it?
If we do that, why do we set the noout flag? Removing the OSD from the
CRUSH map would still trigger a rebalance.

Could we just remove the auth key and re-create the OSD with the new disk
(then add the auth key back)?
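
An untested sketch of that idea, again in Python and again only building
the commands. Assumptions: the cluster uses the default name "ceph" (hence
the keyring path), the capabilities passed to `ceph auth add` are
illustrative, and `ceph-disk prepare` stands in for "re-create the OSD".
Whether the re-created OSD actually ends up with the old id is exactly the
part that still needs testing. The function name is hypothetical.

```python
import shlex


def rekey_replacement_commands(osd_id, new_device):
    """Return the auth-key based replacement steps as argv lists."""
    name = "osd.%d" % osd_id
    # Assumes the default cluster name "ceph" for the data directory.
    keyring = "/var/lib/ceph/osd/ceph-%d/keyring" % osd_id
    return [
        # Drop the key that belonged to the broken disk.
        ["ceph", "auth", "del", name],
        # Re-create the OSD on the new disk.
        ["ceph-disk", "prepare", new_device],
        # Register the key of the re-created OSD (caps are illustrative).
        ["ceph", "auth", "add", name,
         "osd", "allow *", "mon", "allow profile osd", "-i", keyring],
    ]


if __name__ == "__main__":
    # Print the plan for review instead of running it.
    for cmd in rekey_replacement_commands(3, "/dev/sdb"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The CRUSH map is not touched here, so the OSD keeps its CRUSH position
while noout is set.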

I will try this and test it myself.

Feel free to let me know if you have any suggestions!

thanks!!!
vicente

2015-11-20 1:20 GMT+08:00 Loic Dachary <loic@xxxxxxxxxxx>:
> Hi Vicente,
>
> Now that your ceph-disk deactivate/destroy feature is merged (and documented ;-), I wonder if you have time to comment on http://tracker.ceph.com/issues/13732 which is about replacing a disk ? Your input would be much appreciated.
>
> Cheers
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>