Re: Problem with query and any operation on PGs

Hi,

> On Wed, 24 May 2017, Łukasz Chrustek wrote:
>> Hi,
>> 
>> > On Wed, 24 May 2017, Łukasz Chrustek wrote:
>> >> Hi,
>> >> 
>> >> > On Tue, 23 May 2017, Łukasz Chrustek wrote:
>> >> >> Hi,
>> >> >> 
>> >> >> > On Tue, 23 May 2017, Łukasz Chrustek wrote:
>> >> >> >> I haven't slept for over 30 hours and still can't find a solution. I
>> >> >> >> did, as you wrote, turn off these OSDs
>> >> >> >> (https://pastebin.com/1npBXeMV), but it didn't resolve the issue...
>> >> >> 
>> >> >> > The important bit is:
>> >> >> 
>> >> >> >             "blocked": "peering is blocked due to down osds",
>> >> >> >             "down_osds_we_would_probe": [
>> >> >> >                 6,
>> >> >> >                 10,
>> >> >> >                 33,
>> >> >> >                 37,
>> >> >> >                 72
>> >> >> >             ],
>> >> >> >             "peering_blocked_by": [
>> >> >> >                 {
>> >> >> >                     "osd": 6,
>> >> >> >                     "current_lost_at": 0,
>> >> >> >                     "comment": "starting or marking this osd lost may let us proceed"
>> >> >> >                 },
>> >> >> >                 {
>> >> >> >                     "osd": 10,
>> >> >> >                     "current_lost_at": 0,
>> >> >> >                     "comment": "starting or marking this osd lost may let us proceed"
>> >> >> >                 },
>> >> >> >                 {
>> >> >> >                     "osd": 37,
>> >> >> >                     "current_lost_at": 0,
>> >> >> >                     "comment": "starting or marking this osd lost may let us proceed"
>> >> >> >                 },
>> >> >> >                 {
>> >> >> >                     "osd": 72,
>> >> >> >                     "current_lost_at": 113771,
>> >> >> >                     "comment": "starting or marking this osd lost may let us proceed"

> These are the osds (6, 10, 37, 72).

>> >> >> >                 }
>> >> >> >             ]
>> >> >> >         },
>> >> >> 
>> >> >> > Are any of those OSDs startable?

> This

OSD 6 isn't startable.

OSDs 10, 37 and 72 are startable.
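
For reference, whether an OSD is startable can usually be checked along these
lines (systemd unit name and log path are assumptions, not taken from this
cluster):

  systemctl start ceph-osd@6
  systemctl status ceph-osd@6
  tail -n 50 /var/log/ceph/ceph-osd.6.log   # look for the assert/error that stops it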

>> >> >> 
>> >> >> They were all up and running, but I decided to shut them down and mark
>> >> >> them out of Ceph. Now it looks like Ceph is working OK, but two PGs are
>> >> >> still in the down state. How do I get rid of that?
>> >> 
>> >> > If you haven't deleted the data, you should start the OSDs back up.

> This

By backing the OSDs up, do you mean copying /var/lib/ceph/osd/ceph-72/* to some
other (non-Ceph) disk?
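
In case a plain file-level copy is meant as a safety net before further
recovery attempts, a rough sketch would be the following (destination path is
hypothetical, and the OSD has to be stopped first):

  systemctl stop ceph-osd@72
  cp -a /var/lib/ceph/osd/ceph-72 /mnt/backup/ceph-72   # destination is an example
  systemctl start ceph-osd@72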

>> >> 
>> >> > If they are partially damaged you can use ceph-objectstore-tool to 
>> >> > extract just the PGs in question to make sure you haven't lost anything,
>> >> > inject them on some other OSD(s) and restart those, and *then* mark the
>> >> > bad OSDs as 'lost'.
>> >> 
>> >> > If all else fails, you can just mark those OSDs 'lost', but in doing so
>> >> > you might be telling the cluster to lose data.
>> >> 
>> >> > The best thing to do is definitely to get those OSDs started again.
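
A rough sketch of that export/import path, assuming filestore OSDs that are
stopped while the tool runs; the OSD ids, PG id and file paths below are only
examples, not taken from a verified run:

  # on the host of the damaged OSD, export one PG
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-6 \
      --journal-path /var/lib/ceph/osd/ceph-6/journal \
      --pgid 1.60 --op export --file /root/pg-1.60.export

  # on a healthy (also stopped) OSD, import it, then start that OSD again
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-88 \
      --journal-path /var/lib/ceph/osd/ceph-88/journal \
      --op import --file /root/pg-1.60.export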

> This

There were actions on these PGs that destroyed them. I started the OSDs (the
three that are startable), but that didn't resolve the situation. I should add
that there are other pools on this cluster; only the pool with the broken/down
PGs has a problem.
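
After starting them it is worth confirming that they actually show up/in and
then re-checking the stuck PGs, e.g.:

  ceph osd tree               # osd.10, osd.37 and osd.72 should be listed as "up"
  ceph pg dump_stuck inactive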
>> >> 
>> >> Now situation looks like this:
>> >> 
>> >> [root@cc1 ~]# rbd info volumes/volume-ccc5d976-cecf-4938-a452-1bee6188987b
>> >> rbd image 'volume-ccc5d976-cecf-4938-a452-1bee6188987b':
>> >>         size 500 GB in 128000 objects
>> >>         order 22 (4096 kB objects)
>> >>         block_name_prefix: rbd_data.ed9d394a851426
>> >>         format: 2
>> >>         features: layering
>> >>         flags:
>> >> 
>> >> [root@cc1 ~]# rados -p volumes ls | grep rbd_data.ed9d394a851426
>> >> (output truncated)
>> >> rbd_data.ed9d394a851426.000000000000447c
>> >> rbd_data.ed9d394a851426.0000000000010857
>> >> rbd_data.ed9d394a851426.000000000000ec8b
>> >> rbd_data.ed9d394a851426.000000000000fa43
>> >> rbd_data.ed9d394a851426.000000000001ef2d
>> >> ^C
>> >> 
>> >> It hangs on this object and doesn't go any further. rbd cp also hangs,
>> >> and so does rbd map.
>> >> 
>> >> Can you advise what the solution for this case could be?
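
One way to confirm that the hang comes from the down PGs is to map the object
the listing stopped on to its PG (pool and object name taken from the output
above):

  ceph osd map volumes rbd_data.ed9d394a851426.000000000001ef2d

If the reported PG is 1.60 or 1.165, the client I/O is simply blocked behind a
down PG.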
>> 
>> > The hang is due to OSD throttling (see my first reply for how to work
>> > around that and get a pg query).  But you already did that and the cluster
>> > told you which OSDs it needs to see up in order for it to peer and
>> > recover.  If you haven't destroyed those disks, you should start those
>> > OSDs and it should be fine.  If you've destroyed the data or the disks are
>> > truly broken and dead, then you can mark those OSDs lost and the cluster
>> > *may* recover (but hard to say given the information you've shared).

> This


[root@cc1 ~]# ceph osd lost 10 --yes-i-really-mean-it
marked osd lost in epoch 115310
[root@cc1 ~]# ceph osd lost 37 --yes-i-really-mean-it
marked osd lost in epoch 115314
[root@cc1 ~]# ceph osd lost 72 --yes-i-really-mean-it
marked osd lost in epoch 115317
[root@cc1 ~]# ceph -s
    cluster 8cdfbff9-b7be-46de-85bd-9d49866fcf60
     health HEALTH_WARN
            2 pgs down
            2 pgs peering
            2 pgs stuck inactive
     monmap e1: 3 mons at {cc1=192.168.128.1:6789/0,cc2=192.168.128.2:6789/0,cc3=192.168.128.3:6789/0}
            election epoch 872, quorum 0,1,2 cc1,cc2,cc3
     osdmap e115434: 100 osds: 89 up, 86 in; 1 remapped pgs
      pgmap v67642483: 4032 pgs, 18 pools, 26713 GB data, 4857 kobjects
            76718 GB used, 107 TB / 182 TB avail
                4030 active+clean
                   1 down+remapped+peering
                   1 down+peering
  client io 14624 kB/s rd, 31619 kB/s wr, 382 op/s rd, 228 op/s wr
[root@cc1 ~]# ceph -s
    cluster 8cdfbff9-b7be-46de-85bd-9d49866fcf60
     health HEALTH_WARN
            2 pgs down
            2 pgs peering
            2 pgs stuck inactive
     monmap e1: 3 mons at {cc1=192.168.128.1:6789/0,cc2=192.168.128.2:6789/0,cc3=192.168.128.3:6789/0}
            election epoch 872, quorum 0,1,2 cc1,cc2,cc3
     osdmap e115434: 100 osds: 89 up, 86 in; 1 remapped pgs
      pgmap v67642485: 4032 pgs, 18 pools, 26713 GB data, 4857 kobjects
            76718 GB used, 107 TB / 182 TB avail
                4030 active+clean
                   1 down+remapped+peering
                   1 down+peering
  client io 17805 kB/s rd, 18787 kB/s wr, 215 op/s rd, 107 op/s wr
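
With the two PGs still down after marking the OSDs lost, querying them again
(PG ids taken from the health detail quoted below; output paths are just
examples) shows what, if anything, is still blocking peering:

  ceph pg 1.60 query  > /tmp/pg-1.60.json
  ceph pg 1.165 query > /tmp/pg-1.165.json
  # check "down_osds_we_would_probe" and "peering_blocked_by" in each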

>> 
>> > sage
>> 
>> What information can I provide to help determine whether it is recoverable?
>> 
>> here are ceph -s and ceph health detail:
>> 
>> [root@cc1 ~]# ceph -s
>>     cluster 8cdfbff9-b7be-46de-85bd-9d49866fcf60
>>      health HEALTH_WARN
>>             2 pgs down
>>             2 pgs peering
>>             2 pgs stuck inactive
>>      monmap e1: 3 mons at {cc1=192.168.128.1:6789/0,cc2=192.168.128.2:6789/0,cc3=192.168.128.3:6789/0}
>>             election epoch 872, quorum 0,1,2 cc1,cc2,cc3
>>      osdmap e115431: 100 osds: 89 up, 86 in; 1 remapped pgs
>>       pgmap v67641261: 4032 pgs, 18 pools, 26706 GB data, 4855 kobjects
>>             76705 GB used, 107 TB / 182 TB avail
>>                 4030 active+clean
>>                    1 down+remapped+peering
>>                    1 down+peering
>>   client io 5704 kB/s rd, 24685 kB/s wr, 49 op/s rd, 165 op/s wr
>> [root@cc1 ~]# ceph health detail
>> HEALTH_WARN 2 pgs down; 2 pgs peering; 2 pgs stuck inactive
>> pg 1.165 is stuck inactive since forever, current state down+peering, last acting [67,88,48]
>> pg 1.60 is stuck inactive since forever, current state down+remapped+peering, last acting [66,40]
>> pg 1.60 is down+remapped+peering, acting [66,40]
>> pg 1.165 is down+peering, acting [67,88,48]
>> [root@cc1 ~]#
>> 
>> -- 
>> Regards,
>>  Łukasz Chrustek
>> 



-- 
Regards,
 Łukasz Chrustek
