How to replace a node in Ceph?

Hello,

On Fri, 5 Sep 2014 12:09:11 +0800 Ding Dinghua wrote:

> Please see my comment below:
> 
> 
> 2014-09-04 21:33 GMT+08:00 Christian Balzer <chibi at gol.com>:
> 
> >
> > Hello,
> >
> > On Thu, 4 Sep 2014 20:56:31 +0800 Ding Dinghua wrote:
> >
> > Aside from what Loic wrote, why not replace the network controller or,
> > if it is onboard, add a card?
> >
> > > Hi all,
> > >         I'm new to Ceph, and apologize if this question has been
> > > asked before.
> > >
> > >         I have set up an 8-node Ceph cluster, and after two months
> > > of running, the network controller of one node broke, so I have to
> > > replace the node with a new one.
> > >         I don't want to trigger data migration, since all I want to
> > > do is replace a node, not shrink the cluster and then enlarge it.
> >
> > Well, you will have (had) data migration unless your cluster was set to
> > noout from the start or had a "mon osd downout subtree limit" set
> > accordingly.
> >
>   [Ding Dinghua]: Yes, I have already set noout flag
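For the record, the flag handling amounts to the following. This is a sketch: the `run` stub only prints each command instead of executing it, so it can be tried anywhere; drop the stub to run it against a real cluster.

```shell
run() { echo "$@"; }     # stub: print instead of execute; drop for real use
run ceph osd set noout   # down osds stay "in" -> no rebalancing during the swap
run ceph osd unset noout # later, once the replacement node is back in service
```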
> 
> >
> > >         I think the following steps may work:
> > >         1)  set osd_crush_update_on_start to false, so that when an
> > > osd starts, it won't modify the CRUSH map and trigger data migration.
> > I think the noin flag might do that trick, too.
> >
>  [Ding Dinghua]: I set osd_crush_update_on_start to false, so when the
> osds on the new node start, the /etc/init.d/ceph script won't run "ceph
> osd crush create-or-move", the osds on the new node will still be listed
> under the old host in the CRUSH map, and no data migration will occur.
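On the replacement node that behaviour can be pinned in ceph.conf; a minimal fragment, assuming the default cluster name and the stock init script:

```ini
# ceph.conf on the replacement node: stop the init script from running
# "ceph osd crush create-or-move", which would relocate the osds in the
# CRUSH map and trigger data migration.
[osd]
osd crush update on start = false
```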
> 
> > >           2)  set the noout flag to prevent osds from being marked
> > > out of the cluster, which would trigger data migration
> > Probably too late at this point...
> >
> > >           3)  mark all osds on the broken node down (actually, since
> > > the network controller is broken, these osds are already down)
> > And not out?
> >
>   [Ding Dinghua]: Yes, since noout flag is set, these osds are [down, in]
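Step 3) can be scripted; the osd ids below are placeholders for whatever lived on the broken node, and the `run` stub only prints the commands so the sketch is safe to execute anywhere:

```shell
run() { echo "$@"; }        # stub: print instead of execute; drop for real use
for id in 0 1 2 3; do       # placeholder ids -- use the broken node's actual osds
  run ceph osd down "$id"   # with noout set they stay "in", so nothing rebalances
done
```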
> 
So far, so good.

However see below:

> >
> > Regards,
> >
> > Christian
> > >           4)  prepare an osd on the new node, keeping osd_num the
> > > same as the corresponding osd on the broken node:
> > >                ceph-osd -i [osd_num] --osd-data=path1 --mkfs

I don't think that will work. To recycle OSDs they would have to be removed
(triggering migration) first.
Just adding new OSDs should do the trick, though.
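A sketch of that alternative, adding a brand-new osd rather than recycling the old id, along the lines of the manual-deployment procedure of the time. The id, data path, and capability strings are assumptions, and the `run` stub only prints the commands so the sketch is runnable anywhere:

```shell
run() { echo "$@"; }   # stub: print instead of execute; drop for real use

run ceph osd create    # allocates the next free osd id (say it returns 8)
run ceph-osd -i 8 --osd-data /var/lib/ceph/osd/ceph-8 --mkfs --mkkey
run ceph auth add osd.8 osd 'allow *' mon 'allow rwx' \
    -i /var/lib/ceph/osd/ceph-8/keyring
```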

Christian

> > >         5) start the osd on the new node; peering and backfilling
> > > will start automatically
> > >         6)  wait until 5) completes, and repeat 4) and 5) until all
> > > osds on the broken node have been moved to the new node
> > >         I have done some tests on my test cluster, and it seemed to
> > > work, but I'm not quite sure it's right in theory, so any comments
> > > will be appreciated.
> > >         Thanks.
> > >
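Steps 5) and 6) above might look like this on a sysvinit-based node. The osd id is a placeholder, and the `run` stub only prints the commands so the sketch is runnable anywhere:

```shell
run() { echo "$@"; }          # stub: print instead of execute; drop for real use
run service ceph start osd.8  # step 5: start the osd (id 8 is a placeholder)
run ceph health               # step 6: repeat until HEALTH_OK before the next osd
```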
> >
> >
> > --
> > Christian Balzer        Network/Systems Engineer
> > chibi at gol.com           Global OnLine Japan/Fusion Communications
> > http://www.gol.com/
> >
> 
> 
> 


-- 
Christian Balzer        Network/Systems Engineer
chibi at gol.com         Global OnLine Japan/Fusion Communications
http://www.gol.com/

