Re: Help Ceph Cluster Down

Arun POONIA <arun.poonia@xxxxxxxxxxxxxxxxx> · Fri, 4 Jan 2019 11:47:12 -0800

Hi Kevin,
Can I remove the newly added server from the cluster and see if that heals it?

When I check hard disk IOPS on the new server, they are very low compared to the existing cluster servers.

Indeed, this is a critical cluster, but I don't have the expertise to make it flawless.

Thanks
Arun

On Fri, Jan 4, 2019 at 11:35 AM Kevin Olbrich <ko@xxxxxxx> wrote:
If you really created and destroyed OSDs before the cluster healed
itself, this data will be permanently lost (not found / inactive).

Also, your PG count is so oversized that the peering calculation
will most likely break, because this was never tested.

If this is a critical cluster, I would start a new one and bring back
the backups (using a better PG count).
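
As a rough illustration of what a "better PG count" could look like, here is a minimal Python sketch of the commonly cited rule of thumb: total PGs across all pools is roughly (number of OSDs x about 100 PGs per OSD) / replica size, rounded up to a power of two. The function name, the target of 100 PGs per OSD, and the replica sizes 2 and 3 are assumptions for illustration only; the thread does not state the pools' replica size. The 39 OSDs and 54656 PGs come from the ceph -s output quoted further down.

```python
import math


def suggested_total_pgs(num_osds: int, replica_size: int, target_per_osd: int = 100) -> int:
    """Rule-of-thumb total PG count across all pools, rounded up to a power of two."""
    if num_osds < 1 or replica_size < 1:
        raise ValueError("num_osds and replica_size must be positive")
    raw = math.ceil(num_osds * target_per_osd / replica_size)
    # Smallest power of two that is >= raw.
    return 1 << (raw - 1).bit_length()


current_total_pgs = 54656  # "18 pools, 54656 pgs" from the quoted ceph -s output
num_osds = 39              # "39 osds" from the quoted ceph -s output

for size in (2, 3):  # assumed replica sizes; not stated in the thread
    suggested = suggested_total_pgs(num_osds, size)
    factor = current_total_pgs / suggested
    print(f"size={size}: suggested ~{suggested} PGs, current is {factor:.1f}x that")
```

Under either assumed replica size the rule of thumb lands on 2048 PGs in total, which gives a sense of how far above the usual guidance 54656 PGs is. Treat this as a starting point only; per-pool sizing depends on how data is distributed across the pools.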

Kevin

On Fri, Jan 4, 2019 at 8:25 PM Arun POONIA <arun.poonia@xxxxxxxxxxxxxxxxx> wrote:

>

> Can anyone comment on this issue, please? I can't seem to get my cluster healthy.

>

> On Fri, Jan 4, 2019 at 6:26 AM Arun POONIA <arun.poonia@xxxxxxxxxxxxxxxxx> wrote:

>>

>> Hi Caspar,

>>

>> The number of IOPS is also quite low. It used to be around 1K+ on one of the pools (VMs); now it is close to 10-30.

>>

>> Thanks

>> Arun

>>

>> On Fri, Jan 4, 2019 at 5:41 AM Arun POONIA <arun.poonia@xxxxxxxxxxxxxxxxx> wrote:

>>>

>>> Hi Caspar,

>>>

>>> Yes and no; the numbers are going up and down. If I run the ceph -s command, I can see the count decrease at one point and later increase again. I see there are many blocked/slow requests. Almost all the OSDs have slow requests. Around 12% of PGs are inactive, and I am not sure how to activate them again.

>>>

>>>

>>> [root@fre101 ~]# ceph health detail

>>> 2019-01-04 05:39:23.860142 7fc37a3a0700 -1 asok(0x7fc3740017a0) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph-guests/ceph-client.admin.1066526.140477441513808.asok': (2) No such file or directory

>>> HEALTH_ERR 1 osds down; 3 pools have many more objects per pg than average; 472812/12392654 objects misplaced (3.815%); 3610 PGs pending on creation; Reduced data availability: 6578 pgs inactive, 1882 pgs down, 86 pgs peering, 850 pgs stale; Degraded data redundancy: 216694/12392654 objects degraded (1.749%), 866 pgs degraded, 16 pgs undersized; 116082 slow requests are blocked > 32 sec; 551 stuck requests are blocked > 4096 sec; too many PGs per OSD (2709 > max 200)

>>> OSD_DOWN 1 osds down

>>>     osd.28 (root=default,host=fre119) is down

>>> MANY_OBJECTS_PER_PG 3 pools have many more objects per pg than average

>>>     pool glance-images objects per pg (10478) is more than 92.7257 times cluster average (113)

>>>     pool vms objects per pg (4717) is more than 41.7434 times cluster average (113)

>>>     pool volumes objects per pg (1220) is more than 10.7965 times cluster average (113)

>>> OBJECT_MISPLACED 472812/12392654 objects misplaced (3.815%)

>>> PENDING_CREATING_PGS 3610 PGs pending on creation

>>>     osds [osd.0,osd.1,osd.10,osd.11,osd.14,osd.15,osd.17,osd.18,osd.19,osd.20,osd.21,osd.22,osd.23,osd.25,osd.26,osd.27,osd.28,osd.3,osd.30,osd.32,osd.33,osd.35,osd.36,osd.37,osd.38,osd.4,osd.5,osd.6,osd.7,osd.9] have pending PGs.

>>> PG_AVAILABILITY Reduced data availability: 6578 pgs inactive, 1882 pgs down, 86 pgs peering, 850 pgs stale

>>>     pg 10.900 is down, acting [18]

>>>     pg 10.90e is stuck inactive for 60266.030164, current state activating, last acting [2,38]

>>>     pg 10.913 is stuck stale for 1887.552862, current state stale+down, last acting [9]

>>>     pg 10.915 is stuck inactive for 60266.215231, current state activating, last acting [30,38]

>>>     pg 11.903 is stuck inactive for 59294.465961, current state activating, last acting [11,38]

>>>     pg 11.910 is down, acting [21]

>>>     pg 11.919 is down, acting [25]

>>>     pg 12.902 is stuck inactive for 57118.544590, current state activating, last acting [36,14]

>>>     pg 13.8f8 is stuck inactive for 60707.167787, current state activating, last acting [29,37]

>>>     pg 13.901 is stuck stale for 60226.543289, current state stale+active+clean, last acting [1,31]

>>>     pg 13.905 is stuck inactive for 60266.050940, current state activating, last acting [2,36]

>>>     pg 13.909 is stuck inactive for 60707.160714, current state activating, last acting [34,36]

>>>     pg 13.90e is stuck inactive for 60707.410749, current state activating, last acting [21,36]

>>>     pg 13.911 is down, acting [25]

>>>     pg 13.914 is stale+down, acting [29]

>>>     pg 13.917 is stuck stale for 580.224688, current state stale+down, last acting [16]

>>>     pg 14.901 is stuck inactive for 60266.037762, current state activating+degraded, last acting [22,37]

>>>     pg 14.90f is stuck inactive for 60296.996447, current state activating, last acting [30,36]

>>>     pg 14.910 is stuck inactive for 60266.077310, current state activating+degraded, last acting [17,37]

>>>     pg 14.915 is stuck inactive for 60266.032445, current state activating, last acting [34,36]

>>>     pg 15.8fa is stuck stale for 560.223249, current state stale+down, last acting [8]

>>>     pg 15.90c is stuck inactive for 59294.402388, current state activating, last acting [29,38]

>>>     pg 15.90d is stuck inactive for 60266.176492, current state activating, last acting [5,36]

>>>     pg 15.915 is down, acting [0]

>>>     pg 15.917 is stuck inactive for 56279.658951, current state activating, last acting [13,38]

>>>     pg 15.91c is stuck stale for 374.590704, current state stale+down, last acting [12]

>>>     pg 16.903 is stuck inactive for 56580.905961, current state activating, last acting [25,37]

>>>     pg 16.90e is stuck inactive for 60266.271680, current state activating, last acting [14,37]

>>>     pg 16.919 is stuck inactive for 59901.802184, current state activating, last acting [20,37]

>>>     pg 16.91e is stuck inactive for 60297.038159, current state activating, last acting [22,37]

>>>     pg 17.8e5 is stuck inactive for 60266.149061, current state activating, last acting [25,36]

>>>     pg 17.910 is stuck inactive for 59901.850204, current state activating, last acting [26,37]

>>>     pg 17.913 is stuck inactive for 60707.208364, current state activating, last acting [13,36]

>>>     pg 17.91a is stuck inactive for 60266.187509, current state activating, last acting [4,37]

>>>     pg 17.91f is down, acting [6]

>>>     pg 18.908 is stuck inactive for 60707.216314, current state activating, last acting [10,36]

>>>     pg 18.911 is stuck stale for 244.570413, current state stale+down, last acting [34]

>>>     pg 18.919 is stuck inactive for 60265.980816, current state activating, last acting [28,36]

>>>     pg 18.91a is stuck inactive for 59901.814714, current state activating, last acting [28,37]

>>>     pg 18.91e is stuck inactive for 60707.179338, current state activating, last acting [0,36]

>>>     pg 19.90a is stuck inactive for 60203.089988, current state activating, last acting [35,38]

>>>     pg 20.8e0 is stuck inactive for 60296.839098, current state activating+degraded, last acting [18,37]

>>>     pg 20.913 is stuck inactive for 60296.977401, current state activating+degraded, last acting [11,37]

>>>     pg 20.91d is stuck inactive for 60296.891370, current state activating+degraded, last acting [10,38]

>>>     pg 21.8e1 is stuck inactive for 60707.422330, current state activating, last acting [21,38]

>>>     pg 21.907 is stuck inactive for 60296.855511, current state activating, last acting [20,36]

>>>     pg 21.90e is stuck inactive for 60266.055557, current state activating, last acting [1,38]

>>>     pg 21.917 is stuck inactive for 60296.940074, current state activating, last acting [15,36]

>>>     pg 22.90b is stuck inactive for 60707.286070, current state activating, last acting [20,36]

>>>     pg 22.90c is stuck inactive for 59901.788199, current state activating, last acting [20,37]

>>>     pg 22.90f is stuck inactive for 60297.062020, current state activating, last acting [38,35]

>>> PG_DEGRADED Degraded data redundancy: 216694/12392654 objects degraded (1.749%), 866 pgs degraded, 16 pgs undersized

>>>     pg 12.85a is active+undersized+degraded, acting [3]

>>>     pg 14.843 is activating+degraded, acting [7,38]

>>>     pg 14.85f is activating+degraded, acting [25,36]

>>>     pg 14.865 is activating+degraded, acting [33,37]

>>>     pg 14.87a is activating+degraded, acting [28,36]

>>>     pg 14.87e is activating+degraded, acting [17,38]

>>>     pg 14.882 is activating+degraded, acting [4,36]

>>>     pg 14.88a is activating+degraded, acting [2,37]

>>>     pg 14.893 is activating+degraded, acting [24,36]

>>>     pg 14.897 is active+undersized+degraded, acting [34]

>>>     pg 14.89c is activating+degraded, acting [14,38]

>>>     pg 14.89e is activating+degraded, acting [15,38]

>>>     pg 14.8a8 is active+undersized+degraded, acting [33]

>>>     pg 14.8b1 is activating+degraded, acting [30,38]

>>>     pg 14.8d4 is active+undersized+degraded, acting [13]

>>>     pg 14.8d8 is active+undersized+degraded, acting [4]

>>>     pg 14.8e6 is active+undersized+degraded, acting [10]

>>>     pg 14.8e7 is active+undersized+degraded, acting [1]

>>>     pg 14.8ef is activating+degraded, acting [9,36]

>>>     pg 14.8f8 is active+undersized+degraded, acting [30]

>>>     pg 14.901 is activating+degraded, acting [22,37]

>>>     pg 14.910 is activating+degraded, acting [17,37]

>>>     pg 14.913 is active+undersized+degraded, acting [18]

>>>     pg 20.821 is activating+degraded, acting [37,33]

>>>     pg 20.825 is activating+degraded, acting [25,36]

>>>     pg 20.84f is active+undersized+degraded, acting [2]

>>>     pg 20.85a is active+undersized+degraded, acting [11]

>>>     pg 20.85f is activating+degraded, acting [1,38]

>>>     pg 20.865 is activating+degraded, acting [8,38]

>>>     pg 20.869 is activating+degraded, acting [27,37]

>>>     pg 20.87b is active+undersized+degraded, acting [30]

>>>     pg 20.88b is activating+degraded, acting [6,38]

>>>     pg 20.895 is activating+degraded, acting [37,27]

>>>     pg 20.89c is activating+degraded, acting [1,36]

>>>     pg 20.8a3 is activating+degraded, acting [30,36]

>>>     pg 20.8ad is activating+degraded, acting [1,38]

>>>     pg 20.8af is activating+degraded, acting [33,37]

>>>     pg 20.8b7 is activating+degraded, acting [0,38]

>>>     pg 20.8b9 is activating+degraded, acting [20,38]

>>>     pg 20.8d4 is activating+degraded, acting [28,37]

>>>     pg 20.8d5 is activating+degraded, acting [24,37]

>>>     pg 20.8e0 is activating+degraded, acting [18,37]

>>>     pg 20.8e3 is activating+degraded, acting [21,38]

>>>     pg 20.8ea is activating+degraded, acting [17,36]

>>>     pg 20.8ee is active+undersized+degraded, acting [4]

>>>     pg 20.8f2 is activating+degraded, acting [3,36]

>>>     pg 20.8fb is activating+degraded, acting [10,38]

>>>     pg 20.8fc is activating+degraded, acting [20,38]

>>>     pg 20.913 is activating+degraded, acting [11,37]

>>>     pg 20.916 is active+undersized+degraded, acting [21]

>>>     pg 20.91d is activating+degraded, acting [10,38]

>>> REQUEST_SLOW 116082 slow requests are blocked > 32 sec

>>>     10619 ops are blocked > 2097.15 sec

>>>     74227 ops are blocked > 1048.58 sec

>>>     18561 ops are blocked > 524.288 sec

>>>     10862 ops are blocked > 262.144 sec

>>>     1037 ops are blocked > 131.072 sec

>>>     520 ops are blocked > 65.536 sec

>>>     256 ops are blocked > 32.768 sec

>>>     osd.29 has blocked requests > 32.768 sec

>>>     osd.15 has blocked requests > 262.144 sec

>>>     osds 12,13,31 have blocked requests > 524.288 sec

>>>     osds 1,8,16,19,23,25,26,33,37,38 have blocked requests > 1048.58 sec

>>>     osds 3,4,5,6,10,14,17,22,27,30,32,35,36 have blocked requests > 2097.15 sec

>>> REQUEST_STUCK 551 stuck requests are blocked > 4096 sec

>>>     551 ops are blocked > 4194.3 sec

>>>     osds 0,28 have stuck requests > 4194.3 sec

>>> TOO_MANY_PGS too many PGs per OSD (2709 > max 200)

>>> [root@fre101 ~]#

>>> [root@fre101 ~]# ceph -s

>>> 2019-01-04 05:39:29.364100 7f0fb32f2700 -1 asok(0x7f0fac0017a0) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph-guests/ceph-client.admin.1066635.139705286924624.asok': (2) No such file or directory

>>>   cluster:

>>>     id:     adb9ad8e-f458-4124-bf58-7963a8d1391f

>>>     health: HEALTH_ERR

>>>             3 pools have many more objects per pg than average

>>>             473825/12392654 objects misplaced (3.823%)

>>>             3723 PGs pending on creation

>>>             Reduced data availability: 6677 pgs inactive, 1948 pgs down, 157 pgs peering, 850 pgs stale

>>>             Degraded data redundancy: 306567/12392654 objects degraded (2.474%), 949 pgs degraded, 16 pgs undersized

>>>             98047 slow requests are blocked > 32 sec

>>>             33 stuck requests are blocked > 4096 sec

>>>             too many PGs per OSD (2690 > max 200)

>>>

>>>   services:

>>>     mon: 3 daemons, quorum ceph-mon01,ceph-mon02,ceph-mon03

>>>     mgr: ceph-mon03(active), standbys: ceph-mon01, ceph-mon02

>>>     osd: 39 osds: 39 up, 39 in; 76 remapped pgs

>>>     rgw: 1 daemon active

>>>

>>>   data:

>>>     pools:   18 pools, 54656 pgs

>>>     objects: 6051k objects, 10944 GB

>>>     usage:   21934 GB used, 50687 GB / 72622 GB avail

>>>     pgs:     13.267% pgs not active

>>>              306567/12392654 objects degraded (2.474%)

>>>              473825/12392654 objects misplaced (3.823%)

>>>              44937 active+clean

>>>              3850  activating

>>>              1936  active+undersized

>>>              1078  down

>>>              864   stale+down

>>>              597   peering

>>>              591   activating+degraded

>>>              316   active+undersized+degraded

>>>              205   stale+active+clean

>>>              133   stale+activating

>>>              67    activating+remapped

>>>              32    stale+activating+degraded

>>>              21    stale+activating+remapped

>>>              9     stale+active+undersized

>>>              6     down+remapped

>>>              5     stale+activating+undersized+degraded+remapped

>>>              2     activating+degraded+remapped

>>>              1     stale+activating+degraded+remapped

>>>              1     stale+active+undersized+degraded

>>>              1     remapped+peering

>>>              1     active+clean+remapped

>>>              1     stale+remapped+peering

>>>              1     stale+peering

>>>              1     activating+undersized+degraded+remapped

>>>

>>>   io:

>>>     client:   0 B/s rd, 23566 B/s wr, 0 op/s rd, 3 op/s wr

>>>

>>> Thanks

>>>

>>> Arun

>>>

>>> On Fri, Jan 4, 2019 at 5:38 AM Caspar Smit <casparsmit@xxxxxxxxxxx> wrote:

>>>>

>>>> Are the numbers still decreasing?

>>>>

>>>> This one for instance:

>>>>

>>>> "3883 PGs pending on creation"

>>>>

>>>> Caspar

>>>>

>>>>

>>>> On Fri, Jan 4, 2019 at 2:23 PM Arun POONIA <arun.poonia@xxxxxxxxxxxxxxxxx> wrote:

>>>>>

>>>>> Hi Caspar,

>>>>>

>>>>> Yes, the cluster was working fine with the PGs-per-OSD warning up until now. I am not sure how to recover from stale, down, or inactive PGs. If you happen to know about this, can you let me know?

>>>>>

>>>>> Current State:

>>>>>

>>>>> [root@fre101 ~]# ceph -s

>>>>> 2019-01-04 05:22:05.942349 7f314f613700 -1 asok(0x7f31480017a0) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph-guests/ceph-client.admin.1053724.139849638091088.asok': (2) No such file or directory

>>>>>   cluster:

>>>>>     id:     adb9ad8e-f458-4124-bf58-7963a8d1391f

>>>>>     health: HEALTH_ERR

>>>>>             3 pools have many more objects per pg than average

>>>>>             505714/12392650 objects misplaced (4.081%)

>>>>>             3883 PGs pending on creation

>>>>>             Reduced data availability: 6519 pgs inactive, 1870 pgs down, 1 pg peering, 886 pgs stale

>>>>>             Degraded data redundancy: 42987/12392650 objects degraded (0.347%), 634 pgs degraded, 16 pgs undersized

>>>>>             125827 slow requests are blocked > 32 sec

>>>>>             2 stuck requests are blocked > 4096 sec

>>>>>             too many PGs per OSD (2758 > max 200)

>>>>>

>>>>>   services:

>>>>>     mon: 3 daemons, quorum ceph-mon01,ceph-mon02,ceph-mon03

>>>>>     mgr: ceph-mon03(active), standbys: ceph-mon01, ceph-mon02

>>>>>     osd: 39 osds: 39 up, 39 in; 76 remapped pgs

>>>>>     rgw: 1 daemon active

>>>>>

>>>>>   data:

>>>>>     pools:   18 pools, 54656 pgs

>>>>>     objects: 6051k objects, 10944 GB

>>>>>     usage:   21933 GB used, 50688 GB / 72622 GB avail

>>>>>     pgs:     11.927% pgs not active

>>>>>              42987/12392650 objects degraded (0.347%)

>>>>>              505714/12392650 objects misplaced (4.081%)

>>>>>              48080 active+clean

>>>>>              3885  activating

>>>>>              1111  down

>>>>>              759   stale+down

>>>>>              614   activating+degraded

>>>>>              74    activating+remapped

>>>>>              46    stale+active+clean

>>>>>              35    stale+activating

>>>>>              21    stale+activating+remapped

>>>>>              9     stale+active+undersized

>>>>>              9     stale+activating+degraded

>>>>>              5     stale+activating+undersized+degraded+remapped

>>>>>              3     activating+degraded+remapped

>>>>>              1     stale+activating+degraded+remapped

>>>>>              1     stale+active+undersized+degraded

>>>>>              1     remapped+peering

>>>>>              1     active+clean+remapped

>>>>>              1     activating+undersized+degraded+remapped

>>>>>

>>>>>   io:

>>>>>     client:   0 B/s rd, 25397 B/s wr, 4 op/s rd, 4 op/s wr

>>>>>

>>>>> I will update the number of PGs per OSD once these inactive or stale PGs come online. I am not able to access the VMs (VMs, Images) which are using Ceph.

>>>>>

>>>>> Thanks

>>>>> Arun

>>>>>

>>>>> On Fri, Jan 4, 2019 at 4:53 AM Caspar Smit <casparsmit@xxxxxxxxxxx> wrote:

>>>>>>

>>>>>> Hi Arun,

>>>>>>

>>>>>> How did you end up with a 'working' cluster with so many pgs per OSD?

>>>>>>

>>>>>> "too many PGs per OSD (2968 > max 200)"

>>>>>>

>>>>>> To (temporarily) allow this kind of pgs per osd you could try this:

>>>>>>

>>>>>> Change these values in the global section in your ceph.conf:

>>>>>>

>>>>>> mon max pg per osd = 200

>>>>>> osd max pg per osd hard ratio = 2

>>>>>>

>>>>>> This allows 200 * 2 = 400 PGs per OSD before the creation of new PGs is disabled.

>>>>>>

>>>>>> Above are the defaults (for Luminous, maybe other versions too)

>>>>>> You can check your current settings with:

>>>>>>

>>>>>> ceph daemon mon.ceph-mon01 config show |grep pg_per_osd

>>>>>>

>>>>>> Since your current PGs-per-OSD ratio is way higher than the default, you could set them to, for instance:

>>>>>>

>>>>>> mon max pg per osd = 1000

>>>>>> osd max pg per osd hard ratio = 5

>>>>>>

>>>>>> This allows for 1000 * 5 = 5000 PGs per OSD before the creation of new PGs is disabled.
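
To make the arithmetic concrete, here is a small Python sketch of the limit as it is described in this message: the hard limit is the product of the two settings. The function and variable names are hypothetical and used only for illustration; the numbers are the defaults and suggested values quoted in this message and the PGs-per-OSD figure from the reported warning.

```python
def pg_hard_limit(mon_max_pg_per_osd: int, hard_ratio: float) -> float:
    """PGs per OSD above which creation of new PGs is disabled, as described in the message."""
    return mon_max_pg_per_osd * hard_ratio


reported_pgs_per_osd = 2968  # from "too many PGs per OSD (2968 > max 200)"

default_limit = pg_hard_limit(200, 2)   # defaults quoted in the message
raised_limit = pg_hard_limit(1000, 5)   # suggested temporary values

for label, limit in (("default", default_limit), ("raised", raised_limit)):
    status = "within" if reported_pgs_per_osd <= limit else "over"
    print(f"{label}: limit {limit:g} PGs per OSD, reported {reported_pgs_per_osd} is {status} the limit")
```

With the defaults the reported value is well over the limit, while the suggested temporary values leave some headroom. This only lifts the limit; it does not reduce the actual number of PGs.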

>>>>>>

>>>>>> You'll need to inject the settings into the mons/osds and restart the mgrs to make them active.

>>>>>>

>>>>>> ceph tell mon.* injectargs '--mon_max_pg_per_osd 1000'

>>>>>> ceph tell mon.* injectargs '--osd_max_pg_per_osd_hard_ratio 5'

>>>>>> ceph tell osd.* injectargs '--mon_max_pg_per_osd 1000'

>>>>>> ceph tell osd.* injectargs '--osd_max_pg_per_osd_hard_ratio 5'

>>>>>> restart mgrs

>>>>>>

>>>>>> Kind regards,

>>>>>> Caspar

>>>>>>

>>>>>>

>>>>>> On Fri, Jan 4, 2019 at 4:28 AM Arun POONIA <arun.poonia@xxxxxxxxxxxxxxxxx> wrote:

>>>>>>>

>>>>>>> Hi Chris,

>>>>>>>

>>>>>>> Indeed, that's what happened. I didn't set the noout flag either, and I zapped the disks on the new server every time. In my cluster status, fre201 is the only new server.

>>>>>>>

>>>>>>> Current Status after enabling 3 OSDs on fre201 host.

>>>>>>>

>>>>>>> [root@fre201 ~]# ceph osd tree

>>>>>>> ID  CLASS WEIGHT   TYPE NAME       STATUS REWEIGHT PRI-AFF

>>>>>>>  -1       70.92137 root default

>>>>>>>  -2        5.45549     host fre101

>>>>>>>   0   hdd  1.81850         osd.0       up  1.00000 1.00000

>>>>>>>   1   hdd  1.81850         osd.1       up  1.00000 1.00000

>>>>>>>   2   hdd  1.81850         osd.2       up  1.00000 1.00000

>>>>>>>  -9        5.45549     host fre103

>>>>>>>   3   hdd  1.81850         osd.3       up  1.00000 1.00000

>>>>>>>   4   hdd  1.81850         osd.4       up  1.00000 1.00000

>>>>>>>   5   hdd  1.81850         osd.5       up  1.00000 1.00000

>>>>>>>  -3        5.45549     host fre105

>>>>>>>   6   hdd  1.81850         osd.6       up  1.00000 1.00000

>>>>>>>   7   hdd  1.81850         osd.7       up  1.00000 1.00000

>>>>>>>   8   hdd  1.81850         osd.8       up  1.00000 1.00000

>>>>>>>  -4        5.45549     host fre107

>>>>>>>   9   hdd  1.81850         osd.9       up  1.00000 1.00000

>>>>>>>  10   hdd  1.81850         osd.10      up  1.00000 1.00000

>>>>>>>  11   hdd  1.81850         osd.11      up  1.00000 1.00000

>>>>>>>  -5        5.45549     host fre109

>>>>>>>  12   hdd  1.81850         osd.12      up  1.00000 1.00000

>>>>>>>  13   hdd  1.81850         osd.13      up  1.00000 1.00000

>>>>>>>  14   hdd  1.81850         osd.14      up  1.00000 1.00000

>>>>>>>  -6        5.45549     host fre111

>>>>>>>  15   hdd  1.81850         osd.15      up  1.00000 1.00000

>>>>>>>  16   hdd  1.81850         osd.16      up  1.00000 1.00000

>>>>>>>  17   hdd  1.81850         osd.17      up  0.79999 1.00000

>>>>>>>  -7        5.45549     host fre113

>>>>>>>  18   hdd  1.81850         osd.18      up  1.00000 1.00000

>>>>>>>  19   hdd  1.81850         osd.19      up  1.00000 1.00000

>>>>>>>  20   hdd  1.81850         osd.20      up  1.00000 1.00000

>>>>>>>  -8        5.45549     host fre115

>>>>>>>  21   hdd  1.81850         osd.21      up  1.00000 1.00000

>>>>>>>  22   hdd  1.81850         osd.22      up  1.00000 1.00000

>>>>>>>  23   hdd  1.81850         osd.23      up  1.00000 1.00000

>>>>>>> -10        5.45549     host fre117

>>>>>>>  24   hdd  1.81850         osd.24      up  1.00000 1.00000

>>>>>>>  25   hdd  1.81850         osd.25      up  1.00000 1.00000

>>>>>>>  26   hdd  1.81850         osd.26      up  1.00000 1.00000

>>>>>>> -11        5.45549     host fre119

>>>>>>>  27   hdd  1.81850         osd.27      up  1.00000 1.00000

>>>>>>>  28   hdd  1.81850         osd.28      up  1.00000 1.00000

>>>>>>>  29   hdd  1.81850         osd.29      up  1.00000 1.00000

>>>>>>> -12        5.45549     host fre121

>>>>>>>  30   hdd  1.81850         osd.30      up  1.00000 1.00000

>>>>>>>  31   hdd  1.81850         osd.31      up  1.00000 1.00000

>>>>>>>  32   hdd  1.81850         osd.32      up  1.00000 1.00000

>>>>>>> -13        5.45549     host fre123

>>>>>>>  33   hdd  1.81850         osd.33      up  1.00000 1.00000

>>>>>>>  34   hdd  1.81850         osd.34      up  1.00000 1.00000

>>>>>>>  35   hdd  1.81850         osd.35      up  1.00000 1.00000

>>>>>>> -27        5.45549     host fre201

>>>>>>>  36   hdd  1.81850         osd.36      up  1.00000 1.00000

>>>>>>>  37   hdd  1.81850         osd.37      up  1.00000 1.00000

>>>>>>>  38   hdd  1.81850         osd.38      up  1.00000 1.00000

>>>>>>> [root@fre201 ~]#

>>>>>>> [root@fre201 ~]# ceph -s

>>>>>>>   cluster:

>>>>>>>     id:     adb9ad8e-f458-4124-bf58-7963a8d1391f

>>>>>>>     health: HEALTH_ERR

>>>>>>>             3 pools have many more objects per pg than average

>>>>>>>             585791/12391450 objects misplaced (4.727%)

>>>>>>>             2 scrub errors

>>>>>>>             2374 PGs pending on creation

>>>>>>>             Reduced data availability: 6578 pgs inactive, 2025 pgs down, 74 pgs peering, 1234 pgs stale

>>>>>>>             Possible data damage: 2 pgs inconsistent

>>>>>>>             Degraded data redundancy: 64969/12391450 objects degraded (0.524%), 616 pgs degraded, 20 pgs undersized

>>>>>>>             96242 slow requests are blocked > 32 sec

>>>>>>>             228 stuck requests are blocked > 4096 sec

>>>>>>>             too many PGs per OSD (2768 > max 200)

>>>>>>>

>>>>>>>   services:

>>>>>>>     mon: 3 daemons, quorum ceph-mon01,ceph-mon02,ceph-mon03

>>>>>>>     mgr: ceph-mon03(active), standbys: ceph-mon01, ceph-mon02

>>>>>>>     osd: 39 osds: 39 up, 39 in; 96 remapped pgs

>>>>>>>     rgw: 1 daemon active

>>>>>>>

>>>>>>>   data:

>>>>>>>     pools:   18 pools, 54656 pgs

>>>>>>>     objects: 6050k objects, 10942 GB

>>>>>>>     usage:   21900 GB used, 50721 GB / 72622 GB avail

>>>>>>>     pgs:     0.002% pgs unknown

>>>>>>>              12.050% pgs not active

>>>>>>>              64969/12391450 objects degraded (0.524%)

>>>>>>>              585791/12391450 objects misplaced (4.727%)

>>>>>>>              47489 active+clean

>>>>>>>              3670  activating

>>>>>>>              1098  stale+down

>>>>>>>              923   down

>>>>>>>              575   activating+degraded

>>>>>>>              563   stale+active+clean

>>>>>>>              105   stale+activating

>>>>>>>              78    activating+remapped

>>>>>>>              72    peering

>>>>>>>              25    stale+activating+degraded

>>>>>>>              23    stale+activating+remapped

>>>>>>>              9     stale+active+undersized

>>>>>>>              6     stale+activating+undersized+degraded+remapped

>>>>>>>              5     stale+active+undersized+degraded

>>>>>>>              4     down+remapped

>>>>>>>              4     activating+degraded+remapped

>>>>>>>              2     active+clean+inconsistent

>>>>>>>              1     stale+activating+degraded+remapped

>>>>>>>              1     stale+active+clean+remapped

>>>>>>>              1     stale+remapped+peering

>>>>>>>              1     remapped+peering

>>>>>>>              1     unknown

>>>>>>>

>>>>>>>   io:

>>>>>>>     client:   0 B/s rd, 208 kB/s wr, 22 op/s rd, 22 op/s wr

>>>>>>>

>>>>>>>

>>>>>>>

>>>>>>> Thanks

>>>>>>> Arun

>>>>>>>

>>>>>>>

>>>>>>> On Thu, Jan 3, 2019 at 7:19 PM Chris <bitskrieg@xxxxxxxxxxxxx> wrote:

>>>>>>>>

>>>>>>>> If you added OSDs and then deleted them repeatedly without waiting for replication to finish as the cluster attempted to rebalance across them, it's highly likely that you are permanently missing PGs (especially if the disks were zapped each time).

>>>>>>>>

>>>>>>>> If those 3 down OSDs can be revived there is a (small) chance that you can right the ship, but 1400pg/OSD is pretty extreme.  I'm surprised the cluster even let you do that - this sounds like a data loss event.

>>>>>>>>

>>>>>>>> Bring back the 3 OSDs and see what those 2 inconsistent PGs look like with ceph pg <pgid> query.

>>>>>>>>

>>>>>>>> On January 3, 2019 21:59:38 Arun POONIA <arun.poonia@xxxxxxxxxxxxxxxxx> wrote:

>>>>>>>>>

>>>>>>>>> Hi,

>>>>>>>>>

>>>>>>>>> Recently I tried adding a new node (OSD) to the Ceph cluster using the ceph-deploy tool. Since I was experimenting with the tool, I ended up deleting the OSD nodes on the new server a couple of times.

>>>>>>>>>

>>>>>>>>> Now that the Ceph OSDs are running on the new server, cluster PGs seem to be inactive (10-15%) and they are not recovering or rebalancing. I am not sure what to do. I tried shutting down the OSDs on the new server.

>>>>>>>>>

>>>>>>>>> Status:

>>>>>>>>> [root@fre105 ~]# ceph -s

>>>>>>>>> 2019-01-03 18:56:42.867081 7fa0bf573700 -1 asok(0x7fa0b80017a0) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph-guests/ceph-client.admin.4018644.140328258509136.asok': (2) No such file or directory

>>>>>>>>>   cluster:

>>>>>>>>>     id:     adb9ad8e-f458-4124-bf58-7963a8d1391f

>>>>>>>>>     health: HEALTH_ERR

>>>>>>>>>             3 pools have many more objects per pg than average

>>>>>>>>>             373907/12391198 objects misplaced (3.018%)

>>>>>>>>>             2 scrub errors

>>>>>>>>>             9677 PGs pending on creation

>>>>>>>>>             Reduced data availability: 7145 pgs inactive, 6228 pgs down, 1 pg peering, 2717 pgs stale

>>>>>>>>>             Possible data damage: 2 pgs inconsistent

>>>>>>>>>             Degraded data redundancy: 178350/12391198 objects degraded (1.439%), 346 pgs degraded, 1297 pgs undersized

>>>>>>>>>             52486 slow requests are blocked > 32 sec

>>>>>>>>>             9287 stuck requests are blocked > 4096 sec

>>>>>>>>>             too many PGs per OSD (2968 > max 200)

>>>>>>>>>

>>>>>>>>>   services:

>>>>>>>>>     mon: 3 daemons, quorum ceph-mon01,ceph-mon02,ceph-mon03

>>>>>>>>>     mgr: ceph-mon03(active), standbys: ceph-mon01, ceph-mon02

>>>>>>>>>     osd: 39 osds: 36 up, 36 in; 51 remapped pgs

>>>>>>>>>     rgw: 1 daemon active

>>>>>>>>>

>>>>>>>>>   data:

>>>>>>>>>     pools:   18 pools, 54656 pgs

>>>>>>>>>     objects: 6050k objects, 10941 GB

>>>>>>>>>     usage:   21727 GB used, 45308 GB / 67035 GB avail

>>>>>>>>>     pgs:     13.073% pgs not active

>>>>>>>>>              178350/12391198 objects degraded (1.439%)

>>>>>>>>>              373907/12391198 objects misplaced (3.018%)

>>>>>>>>>              46177 active+clean

>>>>>>>>>              5054  down

>>>>>>>>>              1173  stale+down

>>>>>>>>>              1084  stale+active+undersized

>>>>>>>>>              547   activating

>>>>>>>>>              201   stale+active+undersized+degraded

>>>>>>>>>              158   stale+activating

>>>>>>>>>              96    activating+degraded

>>>>>>>>>              46    stale+active+clean

>>>>>>>>>              42    activating+remapped

>>>>>>>>>              34    stale+activating+degraded

>>>>>>>>>              23    stale+activating+remapped

>>>>>>>>>              6     stale+activating+undersized+degraded+remapped

>>>>>>>>>              6     activating+undersized+degraded+remapped

>>>>>>>>>              2     activating+degraded+remapped

>>>>>>>>>              2     active+clean+inconsistent

>>>>>>>>>              1     stale+activating+degraded+remapped

>>>>>>>>>              1     stale+active+clean+remapped

>>>>>>>>>              1     stale+remapped

>>>>>>>>>              1     down+remapped

>>>>>>>>>              1     remapped+peering

>>>>>>>>>

>>>>>>>>>   io:

>>>>>>>>>     client:   0 B/s rd, 208 kB/s wr, 28 op/s rd, 28 op/s wr

>>>>>>>>>

>>>>>>>>> Thanks

>>>>>>>>> --

>>>>>>>>> Arun Poonia

>>>>>>>>>

>>>>>>>>> _______________________________________________

>>>>>>>>> ceph-users mailing list

>>>>>>>>> ceph-users@xxxxxxxxxxxxxx

>>>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
Arun Poonia