Hello,

On Mon, 31 Aug 2015 22:44:05 +0000 Stillwell, Bryan wrote:

> We have the following in our ceph.conf to bring in new OSDs with a
> weight of 0:
>
> [osd]
> osd_crush_initial_weight = 0
>
> We then set 'nobackfill' and bring in each OSD at full weight one at a
> time (letting things settle down before bringing in the next OSD).
> Once all the OSDs are brought in we unset 'nobackfill' and let ceph
> take care of the rest. This seems to work pretty well for us.
>
That looks interesting, will give it a spin on my test cluster.

One thing the "letting things settle down" reminded me of is that adding
OSDs, and especially a new node, will cause (potentially significant)
data movement resulting from CRUSH map changes, something to keep in
mind when scheduling even those "harmless" first steps.

Christian

> Bryan
>
> On 8/31/15, 4:08 PM, "ceph-users on behalf of Wang, Warren"
> <ceph-users-bounces@xxxxxxxxxxxxxx on behalf of
> Warren_Wang@xxxxxxxxxxxxxxxxx> wrote:
>
> >When we know we need to take a node out, we weight it down over time.
> >Depending on your cluster, you may need to do this over days or hours.
> >
> >In theory, you could do the same when putting OSDs in, by setting
> >noin, and then setting the weight to something very low and going up
> >over time. I haven't tried this, though.
> >
> >--
> >Warren Wang
> >Comcast Cloud (OpenStack)
> >
> >On 8/31/15, 2:57 AM, "ceph-users on behalf of Udo Lembke"
> ><ceph-users-bounces@xxxxxxxxxxxxxx on behalf of ulembke@xxxxxxxxxxxx>
> >wrote:
> >
> >>Hi Christian,
> >>for my setup "b" takes too long - too much data movement and stress
> >>on all nodes.
> >>I have simply (with replica 3) set noout, reinstalled one node (with
> >>a new filesystem on the OSDs, but left them in the crushmap) and
> >>started all OSDs again (on a Friday night) - the rebuild takes a bit
> >>less than one day (11*4TB, 1*8TB).
> >>It does also stress the other nodes, but less than weighting to zero
> >>would.
> >>
> >>Udo
> >>
> >>On 31.08.2015 06:07, Christian Balzer wrote:
> >>>
> >>> Hello,
> >>>
> >>> I'm about to add another storage node to the small firefly cluster
> >>> here and refurbish 2 existing nodes (more RAM, different OSD disks).
> >>>
> >>> Insert rant about not going to start using ceph-deploy, as I would
> >>> have to set the cluster to noin since "prepare" also activates
> >>> things due to the udev magic...
> >>>
> >>> This cluster is quite at the limits of its IOPS capacity (the HW
> >>> was requested ages ago, but the mills here grind slowly and not
> >>> particularly fine either), so the plan is to:
> >>>
> >>> a) phase in the new node (let's call it C), one OSD at a time (in
> >>>    the dead of night)
> >>> b) empty out old node A (weight 0), one OSD at a time. When done,
> >>>    refurbish it and bring it back in, like above.
> >>> c) repeat with the 2nd old node B.
> >>>
> >>> Looking at this it's obvious where the big optimization in this
> >>> procedure would be: having the ability to "freeze" the OSDs on
> >>> node B, that is, making them ineligible for any new PGs while
> >>> preserving their current status.
> >>> That way data moves from A to C (which is significantly faster
> >>> than A or B) and then back to A when it is refurbished, avoiding
> >>> any heavy lifting by B.
> >>>
> >>> Does that sound like something other people might find useful as
> >>> well and is it feasible w/o upsetting the CRUSH applecart?
> >>>
> >>> Christian
> >>>
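P.S.: Writing down Bryan's procedure in command form for when I try it.
This is untested here so far, and the OSD id and weight below are just
example values (CRUSH weight roughly equals the disk size in TiB):

    # in ceph.conf on the new node, before the OSDs are created
    [osd]
    osd_crush_initial_weight = 0

    # keep data from moving while the new OSDs come up at weight 0
    ceph osd set nobackfill

    # then, one OSD at a time, set the real weight and let peering settle
    ceph osd crush reweight osd.42 1.82    # example: a ~2TB disk

    # once every new OSD is at its final weight, start the actual backfill
    ceph osd unset nobackfill

The draining side (step "b" of my plan) would then simply be the reverse,
e.g. "ceph osd crush reweight osd.17 0" for each OSD on node A, again one
at a time.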
--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com