Thanks. I tried this method just as the Ceph documentation says. But I had
only tested it on osd.6, and the leveldb of osd.6 is broken, so it cannot
start. When I try this on other OSDs, it works.

2016-03-29 8:22 GMT+08:00 Christian Balzer <chibi@xxxxxxx>:
> On Mon, 28 Mar 2016 18:36:14 +0800 lin zhou wrote:
>
>> > Hello,
>> >
>> > On Sun, 27 Mar 2016 13:41:57 +0800 lin zhou wrote:
>> >
>> > > Hi, guys.
>> > > Some days ago, one OSD showed a large latency in ceph osd
>> > > perf, and this device gave this node a high CPU await.
>> > The thing to do at that point would have been to look at things with atop
>> > or iostat to verify that it was the device itself that was slow and not
>> > because it was genuinely busy due to uneven activity maybe.
>> > As well as a quick glance at SMART of course.
>>
>> Thanks. I will follow this when I face this problem next time.
>>
>> > > So I deleted this OSD and then checked this device.
>> > If that device (HDD, SSD, which model?) slowed down your cluster, you
>> > should not have deleted it.
>> > The best method would have been to set your cluster to noout and stop
>> > that specific OSD.
>> >
>> > When you say "delete", what exact steps did you take?
>> > Did this include removing it from the crush map?
>>
>> Yes, I deleted it from the crush map, deleted its auth, and removed the OSD.
>>
>
> Google is your friend, if you deleted it like in the link below you should
> be able to re-add it the same way:
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-June/002345.html
>
> Christian
>
>> > > But no errors were found.
>> > >
>> > > And now I want to re-add this device into the cluster with its data.
>> > >
>> > All the data was already replicated elsewhere if you deleted/removed
>> > the OSD, you're likely not going to save much if any data movement by
>> > re-adding it.
>>
>> Yes, the cluster finished rebalancing, but I face a problem of one unfound
>> object. And the recovery_state in the output of pg query says this OSD is
>> down, but the other OSDs are OK.
>> So I want to recover this OSD to recover this unfound object.
>>
>> And mark_unfound_lost revert/delete does not work:
>> Error EINVAL: pg has 1 unfound objects but we haven't probed all sources,
>>
>> For details see:
>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-March/008452.html
>>
>> Thanks again.
>>
>> > >
>> > > I tried using ceph-osd to add it, but it cannot start. The logs are
>> > > pasted at: https://gist.github.com/hnuzhoulin/836f9e633b90041e89ad
>> > >
>> > > So what are the recommended steps?
>> > That depends on how you deleted it, but at this point your data is
>> > likely to be mostly stale anyway, so I'd start from scratch.
>>
>> > Christian
>> > --
>> > Christian Balzer        Network/Systems Engineer
>> > chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
>> > http://www.gol.com/
>> >

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
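
For reference, the re-add sequence discussed in this thread (create the OSD
entry again, restore its auth key, put it back into the crush map) can be
sketched as a small dry-run helper. This is only a sketch under assumptions,
not the exact procedure from the linked post: the UUID, crush weight and host
name are hypothetical placeholders, the keyring path assumes the default
/var/lib/ceph/osd/ceph-<id> layout with the default cluster name, and the mon
capability string may differ between releases. It only prints the commands,
it does not run them.

```python
import shlex


def readd_osd_commands(osd_id, osd_uuid, weight, host):
    """Build the ceph CLI calls (as argv lists) that re-register a removed OSD.

    Assumes the OSD data directory is still intact and mounted at the
    default location for a cluster named "ceph".
    """
    name = f"osd.{osd_id}"
    keyring = f"/var/lib/ceph/osd/ceph-{osd_id}/keyring"
    return [
        # Re-create the OSD entry. Passing the OSD's UUID lets the monitor
        # hand back the matching id if it is still known; otherwise a free id
        # is allocated, which may not be the old one. Check the returned id.
        ["ceph", "osd", "create", osd_uuid],
        # Re-register the key that was dropped when the OSD's auth was deleted.
        ["ceph", "auth", "add", name,
         "osd", "allow *", "mon", "allow profile osd", "-i", keyring],
        # Put the OSD back into the crush map under its host bucket.
        ["ceph", "osd", "crush", "add", name, str(weight), f"host={host}"],
    ]


if __name__ == "__main__":
    # Hypothetical values; replace with the real UUID, weight and host.
    placeholder_uuid = "00000000-0000-0000-0000-000000000000"
    for argv in readd_osd_commands(6, placeholder_uuid, 1.0, "node1"):
        print(shlex.join(argv))
```

After these three steps the OSD daemon still has to be started by whatever
init system the node uses. If the OSD's own store is damaged, as with the
leveldb of osd.6 here, the daemon will still fail to start even though the
cluster-side registration is back in place.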