2015-11-20 19:38 GMT+08:00 Sage Weil <sage@xxxxxxxxxxxx>:
> On Fri, 20 Nov 2015, Wei-Chung Cheng wrote:
>> Hi Loic and cephers,
>>
>> Sure, I have time to help (comment) on this disk-replacement feature.
>> It is a useful feature for handling disk failures :p
>>
>> A simple procedure is described at http://tracker.ceph.com/issues/13732 :
>> 1. set the noout flag - if the broken osd is a primary osd, can we handle that well?
>> 2. stop the osd daemon and wait for the osd to actually go down (or
>> maybe use the deactivate option of ceph-disk)
>>
>> These two steps seem OK.
>> Regarding the crush map, should we remove the broken osd from it?
>> If we do, why set the noout flag at all? Removing the osd from the
>> crush map still triggers re-balancing.
>
> Right--I think you generally want to do either one or the other:
>
> 1) mark osd out, leave failed disk in place. or, replace with new disk
> that re-uses the same osd id.
>
> or,
>
> 2) remove osd from crush map. replace with new disk (which gets new osd
> id).
>
> I think re-using the osd id is awkward currently, so doing 1 and replacing
> the disk ends up moving data twice.
>

Hi Sage,

If an osd is in the "DNE" state, must its weight be zero, and does that trigger moving object data?

In my test cases I only remove the auth key and the osd id (the osd then
shows as "DNE"), and I replace the disk with a new one that re-uses the
same osd id. That way the osd spends only a little time in the "out"
state, so I think this procedure could avoid some redundant data movement.

What do you think of this procedure? Or should we do as you say: mark the
osd out (deactivate/destroy ...etc) and replace it with a new disk that
re-uses the same osd id?

By the way, if we just use ceph-deploy/ceph-disk, we cannot create an osd
with a specific osd id. Should we implement that?

thanks!!!
vicente
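
P.S. To make the steps concrete, a rough sketch of the full-removal flow
(Sage's option 2) could look like the following. The osd id "N", the
device "/dev/sdX", and the systemd unit name are only placeholders for
illustration; adjust them for your environment and init system:

    # prevent the cluster from automatically marking osds out while we work
    ceph osd set noout

    # stop the failed daemon (use your init system's equivalent)
    systemctl stop ceph-osd@N

    # take the osd out and remove it from the crush map, auth database and osd map
    ceph osd out N
    ceph osd crush remove osd.N
    ceph auth del osd.N
    ceph osd rm N

    # prepare and activate the replacement disk (it gets a new osd id)
    ceph-disk prepare /dev/sdX
    ceph-disk activate /dev/sdX1

    # allow re-balancing again
    ceph osd unset noout

In the variant I tested above, the "ceph osd crush remove" step is
skipped, which is why the osd shows up as "DNE" in "ceph osd tree".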