Re: Fwd: [Ceph-community] After Mimic upgrade OSD's stuck at booting.

Eugen Block <eblock@xxxxxx> · Tue, 25 Sep 2018 12:32:27 +0000

I would try to reduce recovery to a minimum. Something like this
helped us in a small cluster (25 OSDs on 3 hosts), where recovery
ran while operations continued without impact:

ceph tell 'osd.*' injectargs '--osd-recovery-max-active 2'
ceph tell 'osd.*' injectargs '--osd-max-backfills 8'

Regards,
Eugen

Quoting by morphin <morphinwithyou@xxxxxxxxx>:

Reducing the recovery parameter values did not change much.
Many OSDs are still marked down.

I don't know what to do from this point.

[osd]
osd recovery op priority = 63
osd client op priority = 1
osd recovery max active = 1
osd max scrubs = 1
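
To make it easier to see what the [osd] section above actually changes, here is a minimal Python sketch that parses it and lists the options that differ from the stock values. The ASSUMED_DEFAULTS table is an assumption (defaults as commonly documented for Luminous/Mimic, not taken from this thread); please verify it against your own daemons, for example with `ceph daemon osd.N config show`, before drawing conclusions.

```python
import configparser

# The [osd] section as posted in this thread.
POSTED = """
[osd]
osd recovery op priority = 63
osd client op priority = 1
osd recovery max active = 1
osd max scrubs = 1
"""

# ASSUMPTION: stock defaults for Luminous/Mimic, not confirmed in this thread.
ASSUMED_DEFAULTS = {
    "osd recovery op priority": 3,
    "osd client op priority": 63,
    "osd recovery max active": 3,
    "osd max scrubs": 1,
}


def diff_from_defaults(conf_text, defaults):
    """Return {option: (assumed_default, posted_value)} for options that differ."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    changed = {}
    for key, default in defaults.items():
        if parser.has_option("osd", key):
            value = parser.getint("osd", key)
            if value != default:
                changed[key] = (default, value)
    return changed


changes = diff_from_defaults(POSTED, ASSUMED_DEFAULTS)
for key, (default, value) in changes.items():
    print(f"{key}: assumed default {default}, posted {value}")
```

If the assumed defaults hold, the posted values rank recovery ops above client ops (a higher number means higher priority), which works against keeping client I/O responsive during recovery.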

ceph -s
  cluster:
    id:     89569e73-eb89-41a4-9fc9-d2a5ec5f4106
    health: HEALTH_ERR
            42 osds down
            1 host (6 osds) down
            61/8948582 objects unfound (0.001%)
            Reduced data availability: 3837 pgs inactive, 1822 pgs down, 1900 pgs peering, 6 pgs stale
            Possible data damage: 18 pgs recovery_unfound
            Degraded data redundancy: 457246/17897164 objects degraded (2.555%), 213 pgs degraded, 209 pgs undersized
            2554 slow requests are blocked > 32 sec
            3273 slow ops, oldest one blocked for 1453 sec, daemons [osd.0,osd.1,osd.10,osd.100,osd.101,osd.102,osd.103,osd.104,osd.105,osd.106]... have slow ops.

  services:
    mon: 3 daemons, quorum SRV-SEKUARK3,SRV-SBKUARK2,SRV-SBKUARK3
    mgr: SRV-SBKUARK2(active), standbys: SRV-SEKUARK2, SRV-SEKUARK3, SRV-SEKUARK4
    osd: 168 osds: 118 up, 160 in

  data:
    pools:   1 pools, 4096 pgs
    objects: 8.95 M objects, 17 TiB
    usage:   33 TiB used, 553 TiB / 586 TiB avail
    pgs:     93.677% pgs not active
             457246/17897164 objects degraded (2.555%)
             61/8948582 objects unfound (0.001%)
             1676 down
             1372 peering
             528  stale+peering
             164  active+undersized+degraded
             145  stale+down
             73   activating
             40   active+clean
             29   stale+activating
             17   active+recovery_unfound+undersized+degraded
             16   stale+active+clean
             16   stale+active+undersized+degraded
             9    activating+undersized+degraded
             3    active+recovery_wait+degraded
             2    activating+undersized
             2    activating+degraded
             1    creating+down
             1    stale+active+recovery_unfound+undersized+degraded
             1    stale+active+clean+scrubbing+deep
             1    stale+active+recovery_wait+degraded
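
As a cross-check of the summary lines, the per-state counts above can be tallied directly. This is only a sketch: it treats a PG as active when `active` appears as a whole `+`-separated flag in its state string (so `activating` does not count), which is my reading of the output rather than something stated in the thread.

```python
# PG state counts copied from the `ceph -s` output above.
PG_STATES = """
1676 down
1372 peering
528  stale+peering
164  active+undersized+degraded
145  stale+down
73   activating
40   active+clean
29   stale+activating
17   active+recovery_unfound+undersized+degraded
16   stale+active+clean
16   stale+active+undersized+degraded
9    activating+undersized+degraded
3    active+recovery_wait+degraded
2    activating+undersized
2    activating+degraded
1    creating+down
1    stale+active+recovery_unfound+undersized+degraded
1    stale+active+clean+scrubbing+deep
1    stale+active+recovery_wait+degraded
"""


def tally(pg_text):
    """Return (total_pgs, active_pgs) from 'count state' lines."""
    total = 0
    active = 0
    for line in pg_text.strip().splitlines():
        count_text, state = line.split()
        count = int(count_text)
        total += count
        # Match "active" as a whole flag so "activating" is not counted.
        if "active" in state.split("+"):
            active += count
    return total, active


total, active = tally(PG_STATES)
inactive = total - active
print(f"{inactive}/{total} PGs not active ({100 * inactive / total:.3f}%)")
```

The tally gives 3837 of 4096 PGs not active (93.677%), which agrees with both the "3837 pgs inactive" health line and the "93.677% pgs not active" figure, so only 259 PGs are serving I/O in this snapshot.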

ceph -w: https://paste.ubuntu.com/p/WZ2YqzS86S/
ceph health detail: https://paste.ubuntu.com/p/8w7Jpms8fj/
On Tue, 25 Sep 2018 at 14:32, by morphin <morphinwithyou@xxxxxxxxx>
wrote:

The config didn't work: increasing the values led to more
OSD drops.

bhfs -s
  cluster:
    id:     89569e73-eb89-41a4-9fc9-d2a5ec5f4106
    health: HEALTH_ERR
            norebalance,norecover flag(s) set
            1 osds down
            17/8839434 objects unfound (0.000%)
            Reduced data availability: 3578 pgs inactive, 861 pgs down, 1928 pgs peering, 11 pgs stale
            Degraded data redundancy: 44853/17678868 objects degraded (0.254%), 221 pgs degraded, 20 pgs undersized
            610 slow requests are blocked > 32 sec
            3996 stuck requests are blocked > 4096 sec
            6076 slow ops, oldest one blocked for 4129 sec, daemons [osd.0,osd.1,osd.10,osd.100,osd.101,osd.102,osd.103,osd.104,osd.105,osd.106]... have slow ops.

  services:
    mon: 3 daemons, quorum SRV-SEKUARK3,SRV-SBKUARK2,SRV-SBKUARK3
    mgr: SRV-SBKUARK2(active), standbys: SRV-SEKUARK2, SRV-SEKUARK3
    osd: 168 osds: 128 up, 129 in; 2 remapped pgs
         flags norebalance,norecover

  data:
    pools:   1 pools, 4096 pgs
    objects: 8.84 M objects, 17 TiB
    usage:   26 TiB used, 450 TiB / 477 TiB avail
    pgs:     0.024% pgs unknown
             89.160% pgs not active
             44853/17678868 objects degraded (0.254%)
             17/8839434 objects unfound (0.000%)
             1612 peering
             720  down
             583  activating
             319  stale+peering
             255  active+clean
             157  stale+activating
             108  stale+down
             95   activating+degraded
             84   stale+active+clean
             50   active+recovery_wait+degraded
             29   creating+down
             23   stale+activating+degraded
             18   stale+active+recovery_wait+degraded
             14   active+undersized+degraded
             12   active+recovering+degraded
             4    stale+creating+down
             3    stale+active+recovering+degraded
             3    stale+active+undersized+degraded
             2    stale
             1    active+recovery_wait+undersized+degraded
             1    active+clean+scrubbing+deep
             1    unknown
             1    active+undersized+degraded+remapped+backfilling
             1    active+recovering+undersized+degraded

I guess the OSDs going down and dropping out is what increases the
recovery time, so I decided to try lowering the recovery parameters to
put less load on the cluster.
I have NVMe and SAS disks. The servers are powerful enough. The network
is 4x10Gb.
I don't think my cluster is in bad shape, because I have datacenter
redundancy (14 servers + 14 servers). The 7 crashed servers are all in
datacenter A, and it took only a few minutes to bring them back online.
Also, 2 of them are monitors, so cluster I/O should have been suspended
and there should be less data difference.

On the other hand, I don't understand the burden of this recovery. I
have been through many recoveries, but none of them stopped my cluster
from working. This recovery load is so high that it has not let up for
hours. I wish I could just decrease the recovery speed and continue to
serve my VMs.
Is the recovery load somehow different in Mimic? Luminous was pretty
fine.
On Tue, 25 Sep 2018 at 13:57, by morphin <morphinwithyou@xxxxxxxxx>
wrote:
>
> Thank you for the answer.
>
> What do you think of this config to speed up the recovery?
>
> [osd]
> osd recovery op priority = 63
> osd client op priority = 1
> osd recovery max active = 16
> osd max scrubs = 16
> On Tue, 25 Sep 2018 at 13:37, the user with the address
> <admin@xxxxxxxxxxxxxxx> wrote:
> >
> > Just let it recover.
> >
> >   data:
> >     pools:   1 pools, 4096 pgs
> >     objects: 8.95 M objects, 17 TiB
> >     usage:   34 TiB used, 577 TiB / 611 TiB avail
> >     pgs:     94.873% pgs not active
> >              48475/17901254 objects degraded (0.271%)
> >              1/8950627 objects unfound (0.000%)
> >              2631 peering
> >              637  activating
> >              562  down
> >              159  active+clean
> >              44   activating+degraded
> >              30   active+recovery_wait+degraded
> >              12   activating+undersized+degraded
> >              10   active+recovering+degraded
> >              10   active+undersized+degraded
> >              1    active+clean+scrubbing+deep
> >
> > You've got PGs being deep-scrubbed, which puts considerable I/O load on the OSDs.
> >
> >
> > September 25, 2018 1:23 PM, "by morphin" <morphinwithyou@xxxxxxxxx> wrote:
> >
> >
> > > What should I do now?
> > >
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
