On Tue, 27 Jan 2015 01:37:52 +0000 Quenten Grasso wrote:

> Hi Christian,
>
> As you'll probably notice, we have 11, 22, 33 and 44 marked as out as
> well, but here's our tree.
>
> All of the OSDs in question had already been rebalanced/emptied from
> the hosts. osd.0 existed on pbnerbd01.
>
Ah, lemme re-phrase that then, I was assuming a simpler scenario.

Same reasoning: by removing the OSD, the weight (not the reweight) of the
host changed (from 11 to 10), and that then triggered the re-balancing.

Clear as mud? ^.^

Christian

> # ceph osd tree
> # id    weight  type name               up/down reweight
> -1      54      root default
> -3      54              rack unknownrack
> -2      10                      host pbnerbd01
> 1       1                               osd.1   up      1
> 10      1                               osd.10  up      1
> 2       1                               osd.2   up      1
> 3       1                               osd.3   up      1
> 4       1                               osd.4   up      1
> 5       1                               osd.5   up      1
> 6       1                               osd.6   up      1
> 7       1                               osd.7   up      1
> 8       1                               osd.8   up      1
> 9       1                               osd.9   up      1
> -4      11                      host pbnerbd02
> 11      1                               osd.11  up      0
> 12      1                               osd.12  up      1
> 13      1                               osd.13  up      1
> 14      1                               osd.14  up      1
> 15      1                               osd.15  up      1
> 16      1                               osd.16  up      1
> 17      1                               osd.17  up      1
> 18      1                               osd.18  up      1
> 19      1                               osd.19  up      1
> 20      1                               osd.20  up      1
> 21      1                               osd.21  up      1
> -5      11                      host pbnerbd03
> 22      1                               osd.22  up      0
> 23      1                               osd.23  up      1
> 24      1                               osd.24  up      1
> 25      1                               osd.25  up      1
> 26      1                               osd.26  up      1
> 27      1                               osd.27  up      1
> 28      1                               osd.28  up      1
> 29      1                               osd.29  up      1
> 30      1                               osd.30  up      1
> 31      1                               osd.31  up      1
> 32      1                               osd.32  up      1
> -6      11                      host pbnerbd04
> 33      1                               osd.33  up      0
> 34      1                               osd.34  up      1
> 35      1                               osd.35  up      1
> 36      1                               osd.36  up      1
> 37      1                               osd.37  up      1
> 38      1                               osd.38  up      1
> 39      1                               osd.39  up      1
> 40      1                               osd.40  up      1
> 41      1                               osd.41  up      1
> 42      1                               osd.42  up      1
> 43      1                               osd.43  up      1
> -7      11                      host pbnerbd05
> 44      1                               osd.44  up      0
> 45      1                               osd.45  up      1
> 46      1                               osd.46  up      1
> 47      1                               osd.47  up      1
> 48      1                               osd.48  up      1
> 49      1                               osd.49  up      1
> 50      1                               osd.50  up      1
> 51      1                               osd.51  up      1
> 52      1                               osd.52  up      1
> 53      1                               osd.53  up      1
> 54      1                               osd.54  up      1
>
> Regards,
> Quenten Grasso
>
> -----Original Message-----
> From: Christian Balzer [mailto:chibi@xxxxxxx]
> Sent: Tuesday, 27 January 2015 11:33 AM
> To: ceph-users@xxxxxxxxxxxxxx
> Cc: Quenten Grasso
> Subject: Re: OSD removal rebalancing again
>
> Hello,
>
> A "ceph -s" and "ceph osd tree" would have been nice, but my guess is
> that osd.0 was the only OSD on that particular storage server?
>
> In that case the removal of the bucket (host) by removing the last OSD
> in it also triggered a re-balancing. Not really/well documented AFAIK
> and annoying, but OTOH both expected (from a CRUSH perspective) and
> harmless.
>
> Christian
>
> On Tue, 27 Jan 2015 01:21:28 +0000 Quenten Grasso wrote:
>
> > Hi All,
> >
> > I just removed an OSD from our cluster, following the steps on
> > http://ceph.com/docs/master/rados/operations/add-or-rm-osds/
> >
> > First I set the OSD as out:
> >
> > ceph osd out osd.0
> >
> > This emptied the OSD and eventually the health of the cluster came
> > back to normal/OK, and the OSD was up and out (took about 2-3 hours).
> > osd.0's used space before setting it as out was ~900 GB; after the
> > rebalance took place, its usage was ~150 MB.
> >
> > Once this was all OK I then proceeded to stop the OSD:
> >
> > service ceph stop osd.0
> >
> > I checked cluster health and all looked OK, then I decided to remove
> > the OSD using the following commands:
> >
> > ceph osd crush remove osd.0
> > ceph auth del osd.0
> > ceph osd rm 0
> >
> > Now our cluster says:
> >
> >    health HEALTH_WARN 414 pgs backfill; 12 pgs backfilling; 19 pgs
> >    recovering; 344 pgs recovery_wait; 789 pgs stuck unclean; recovery
> >    390967/10986568 objects degraded (3.559%)
> >
> > Before using the removal procedure everything was "ok" and osd.0 had
> > been emptied and seemingly rebalanced.
> > Any ideas why it's rebalancing again?
> >
> > We're using Ubuntu 12.04 with Ceph 0.80.8 and kernel 3.13.0-43-generic
> > #72~precise1-Ubuntu SMP Tue Dec 9 12:14:18 UTC 2014 x86_64 x86_64
> > x86_64 GNU/Linux
> >
> > Regards,
> > Quenten Grasso


-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/
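
For anyone hitting the same double rebalance: a commonly used order of
operations avoids it by draining the CRUSH weight first, so the host
bucket has already shrunk before the item is removed. This is only a
minimal sketch, not something verified on this exact cluster; it assumes
an OSD like osd.0 with CRUSH weight 1, as in the tree above, and that you
wait for the cluster to settle between steps.

    ceph osd crush reweight osd.0 0   # host bucket weight drops (e.g. 11 -> 10) now;
                                      # this triggers the one and only data migration
    # wait until all PGs are active+clean again
    ceph osd out osd.0                # no further movement; the CRUSH weight is already 0
    service ceph stop osd.0
    ceph osd crush remove osd.0       # removes a zero-weight item
    ceph auth del osd.0
    ceph osd rm 0

Because the item being removed no longer contributes any weight to its
host bucket, the final "ceph osd crush remove" should not trigger the
second round of backfilling described above.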