Hi Christian,

As you'll probably notice, we have 11, 22, 33 and 44 marked as out as well, but here's our tree. All of the OSDs in question had already been rebalanced/emptied from their hosts. osd.0 existed on pbnerbd01.

# ceph osd tree
# id    weight  type name               up/down reweight
-1      54      root default
-3      54        rack unknownrack
-2      10          host pbnerbd01
1       1             osd.1             up      1
10      1             osd.10            up      1
2       1             osd.2             up      1
3       1             osd.3             up      1
4       1             osd.4             up      1
5       1             osd.5             up      1
6       1             osd.6             up      1
7       1             osd.7             up      1
8       1             osd.8             up      1
9       1             osd.9             up      1
-4      11          host pbnerbd02
11      1             osd.11            up      0
12      1             osd.12            up      1
13      1             osd.13            up      1
14      1             osd.14            up      1
15      1             osd.15            up      1
16      1             osd.16            up      1
17      1             osd.17            up      1
18      1             osd.18            up      1
19      1             osd.19            up      1
20      1             osd.20            up      1
21      1             osd.21            up      1
-5      11          host pbnerbd03
22      1             osd.22            up      0
23      1             osd.23            up      1
24      1             osd.24            up      1
25      1             osd.25            up      1
26      1             osd.26            up      1
27      1             osd.27            up      1
28      1             osd.28            up      1
29      1             osd.29            up      1
30      1             osd.30            up      1
31      1             osd.31            up      1
32      1             osd.32            up      1
-6      11          host pbnerbd04
33      1             osd.33            up      0
34      1             osd.34            up      1
35      1             osd.35            up      1
36      1             osd.36            up      1
37      1             osd.37            up      1
38      1             osd.38            up      1
39      1             osd.39            up      1
40      1             osd.40            up      1
41      1             osd.41            up      1
42      1             osd.42            up      1
43      1             osd.43            up      1
-7      11          host pbnerbd05
44      1             osd.44            up      0
45      1             osd.45            up      1
46      1             osd.46            up      1
47      1             osd.47            up      1
48      1             osd.48            up      1
49      1             osd.49            up      1
50      1             osd.50            up      1
51      1             osd.51            up      1
52      1             osd.52            up      1
53      1             osd.53            up      1
54      1             osd.54            up      1

Regards,
Quenten Grasso

-----Original Message-----
From: Christian Balzer [mailto:chibi@xxxxxxx]
Sent: Tuesday, 27 January 2015 11:33 AM
To: ceph-users@xxxxxxxxxxxxxx
Cc: Quenten Grasso
Subject: Re: OSD removal rebalancing again

Hello,

A "ceph -s" and "ceph osd tree" would have been nice, but my guess is that osd.0 was the only OSD on that particular storage server?

In that case the removal of the bucket (host), by removing the last OSD in it, also triggered a rebalancing. Not really/well documented AFAIK and annoying, but OTOH both expected (from a CRUSH perspective) and harmless.

Christian

On Tue, 27 Jan 2015 01:21:28 +0000 Quenten Grasso wrote:

> Hi All,
>
> I just removed an OSD from our cluster following the steps on
> http://ceph.com/docs/master/rados/operations/add-or-rm-osds/
>
> First I set the OSD as out:
>
> ceph osd out osd.0
>
> This emptied the OSD and eventually the health of the cluster came back
> to normal/OK, and the OSD was up and out (this took about 2-3 hours).
> osd.0's used space before setting it out was ~900 GB; after the
> rebalance took place, its usage was ~150 MB.
>
> Once this was all OK, I proceeded to stop the OSD:
>
> service ceph stop osd.0
>
> I checked the cluster health and all looked OK, then I removed the OSD
> using the following commands:
>
> ceph osd crush remove osd.0
> ceph auth del osd.0
> ceph osd rm 0
>
> Now our cluster says:
>
> health HEALTH_WARN 414 pgs backfill; 12 pgs backfilling; 19 pgs
> recovering; 344 pgs recovery_wait; 789 pgs stuck unclean; recovery
> 390967/10986568 objects degraded (3.559%)
>
> Before the removal procedure everything was "OK" and osd.0 had been
> emptied and seemingly rebalanced.
>
> Any ideas why it's rebalancing again?
>
> We're using Ubuntu 12.04 with Ceph 0.80.8 and kernel 3.13.0-43-generic
> #72~precise1-Ubuntu SMP Tue Dec 9 12:14:18 UTC 2014 x86_64 x86_64
> x86_64 GNU/Linux
>
> Regards,
> Quenten Grasso

--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
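
For reference, one commonly suggested way to avoid the second data movement described in this thread is to drain the OSD by dropping its CRUSH weight to zero before removing it, so that the later "ceph osd crush remove" no longer changes any bucket weights. The following is only a minimal sketch, assuming the OSD id is 0 and a sysvinit-based Ubuntu install as in the original post; the commands are standard Ceph CLI, but verify them against your release's documentation before use.

# Drain osd.0 via its CRUSH weight instead of only marking it out;
# data then moves once, and the later "crush remove" does not change
# placement again.
ceph osd crush reweight osd.0 0

# Wait until the cluster reports HEALTH_OK again (all PGs active+clean).
ceph -s

# Now mark it out, stop the daemon, and remove it; no further
# rebalancing is expected at this point.
ceph osd out osd.0
service ceph stop osd.0
ceph osd crush remove osd.0
ceph auth del osd.0
ceph osd rm 0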