Hi Christian,

for my setup "b" takes too long - too much data movement and stress on all nodes.

I have simply (with replica 3) set "noout", reinstalled one node (new filesystems on the OSDs, but left them in the crushmap) and started all OSDs again (on a Friday night) - the rebuild takes a bit less than one day (11*4TB + 1*8TB). This also stresses the other nodes, but less than weighting them to zero. A rough command sketch follows below the quoted mail.

Udo

On 31.08.2015 06:07, Christian Balzer wrote:
>
> Hello,
>
> I'm about to add another storage node to the small firefly cluster here and
> refurbish 2 existing nodes (more RAM, different OSD disks).
>
> Insert rant about not going to start using ceph-deploy, as I would have to
> set the cluster to noin since "prepare" also activates things due to the
> udev magic...
>
> This cluster is quite at the limits of its IOPS capacity (the HW was
> requested ages ago, but the mills here grind slowly and not particularly
> fine either), so the plan is to:
>
> a) phase in the new node (let's call it C), one OSD at a time (in the dead
> of night)
> b) empty out old node A (weight 0), one OSD at a time. When done,
> refurbish it and bring it back in, like above.
> c) repeat with the 2nd old node B.
>
> Looking at this, it's obvious where the big optimization in this procedure
> would be: having the ability to "freeze" the OSDs on node B.
> That is, making them ineligible for any new PGs while preserving their
> current status.
> So that data moves from A to C (which is significantly faster than A or B)
> and then back to A when it is refurbished, avoiding any heavy lifting by B.
>
> Does that sound like something other people might find useful as well, and
> is it feasible w/o upsetting the CRUSH applecart?
>
> Christian
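For reference, a minimal sketch of the flag handling in the noout approach above. The OSD id (osd.11), device and paths are only examples, and the per-OSD re-creation steps follow the manual OSD setup from the Ceph docs - adapt them to whatever deployment tooling you use:

  # before taking the node down: keep the stopped OSDs "in",
  # so their CRUSH weights and the PG mappings are preserved
  ceph osd set noout

  # ... reinstall the node ...

  # per OSD: new filesystem, re-initialise the data dir under the old id
  mkfs.xfs -f /dev/sdb1
  mount /dev/sdb1 /var/lib/ceph/osd/ceph-11
  ceph-osd -i 11 --mkfs --mkkey
  ceph auth del osd.11      # drop the stale cephx key of the old OSD
  ceph auth add osd.11 osd 'allow *' mon 'allow profile osd' \
      -i /var/lib/ceph/osd/ceph-11/keyring

  # start the OSDs (sysvinit style; use your init system's equivalent);
  # they backfill from the replicas on the other nodes
  service ceph start osd.11

  # once the cluster is back to HEALTH_OK, clear the flag
  ceph osd unset noout

The tradeoff: the PGs on the reinstalled node stay degraded until the backfill finishes, but the data only has to move once.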
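For comparison, the drain in step b) is essentially a per-OSD CRUSH reweight loop, waiting for recovery to settle before touching the next OSD (ids and weights again just examples):

  # drain one OSD on node A: its PGs get remapped to the remaining nodes
  ceph osd crush reweight osd.3 0

  # watch recovery and only continue once the cluster is clean again
  ceph -s

  # after the refurbished node is back, restore the original weight
  ceph osd crush reweight osd.3 3.64   # CRUSH weight, roughly the size in TiB

This moves each PG twice (A -> C, then back to A), which is exactly the double data movement Udo is avoiding with noout.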