Re: jewel - recovery keeps stalling (continues after restarting OSDs)

Hi,

I tried balancing the number of OSDs per node, set their weights to the
same value, and increased the recovery op priority, but recovery still
takes ages...
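
For reference, the reweighting goes roughly like this (the osd id and
weight here are just examples):

  ceph osd crush reweight osd.3 1.0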

I've got my cluster healthy now, so I'll try switching to kraken to see
if it behaves better...

nik



On Mon, Aug 07, 2017 at 11:36:10PM +0800, cgxu wrote:
> I encountered the same issue today and solved it by temporarily adjusting "osd recovery op priority" to 63.
> 
> It looks like the recovery PUSH/PULL ops were getting starved in the op_wq prioritized queue; I never experienced this on hammer.
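> 
> For reference, that change can be injected at runtime without
> restarting the OSDs (63 is the value I used above; the default is 3):
> 
>   ceph tell osd.* injectargs '--osd-recovery-op-priority 63'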
> 
> Any other ideas?
> 
> 
> > Hi,
> > 
> > I'm trying to find the reason for the strange recovery issues I'm
> > seeing on our cluster...
> > 
> > It's a mostly idle, 4-node cluster with 26 OSDs evenly distributed
> > across the nodes, running jewel 10.2.9.
> > 
> > The problem is that after some disk replacements and data moves,
> > recovery progresses extremely slowly. PGs seem to be stuck in the
> > active+recovering+degraded
> > state:
> > 
> > [root@v1d ~]# ceph -s
> >     cluster a5efbc87-3900-4c42-a977-8c93f7aa8c33
> >      health HEALTH_WARN
> >             159 pgs backfill_wait
> >             4 pgs backfilling
> >             259 pgs degraded
> >             12 pgs recovering
> >             113 pgs recovery_wait
> >             215 pgs stuck degraded
> >             266 pgs stuck unclean
> >             140 pgs stuck undersized
> >             151 pgs undersized
> >             recovery 37788/2327775 objects degraded (1.623%)
> >             recovery 23854/2327775 objects misplaced (1.025%)
> >             noout,noin flag(s) set
> >      monmap e21: 3 mons at 
> > {v1a=10.0.0.1:6789/0,v1b=10.0.0.2:6789/0,v1c=10.0.0.3:6789/0}
> >             election epoch 6160, quorum 0,1,2 v1a,v1b,v1c
> >       fsmap e817: 1/1/1 up {0=v1a=up:active}, 1 up:standby
> >      osdmap e76002: 26 osds: 26 up, 26 in; 185 remapped pgs
> >             flags noout,noin,sortbitwise,require_jewel_osds
> >       pgmap v80995844: 3200 pgs, 4 pools, 2876 GB data, 757 kobjects
> >             9215 GB used, 35572 GB / 45365 GB avail
> >             37788/2327775 objects degraded (1.623%)
> >             23854/2327775 objects misplaced (1.025%)
> >                 2912 active+clean
> >                  130 active+undersized+degraded+remapped+wait_backfill
> >                   97 active+recovery_wait+degraded
> >                   29 active+remapped+wait_backfill
> >                   12 active+recovery_wait+undersized+degraded+remapped
> >                    6 active+recovering+degraded
> >                    5 active+recovering+undersized+degraded+remapped
> >                    4 active+undersized+degraded+remapped+backfilling
> >                    4 active+recovery_wait+degraded+remapped
> >                    1 active+recovering+degraded+remapped
> >   client io 2026 B/s rd, 146 kB/s wr, 9 op/s rd, 21 op/s wr
> > 
> > 
> > When I restart the affected OSDs, recovery gets a bump, but then
> > other PGs get stuck. All OSDs have been restarted multiple times and
> > none are even close to nearfull; I just can't find what I'm doing
> > wrong...
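> > 
> > To spot the stuck PGs and pick the OSDs to restart, I use something
> > like the following (the pg and osd ids are just examples):
> > 
> >   ceph pg dump_stuck degraded
> >   ceph pg 4.3f query                 # inspect recovery_state of one pg
> >   systemctl restart ceph-osd@12      # on systemd-based nodes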
> > 
> > possibly related OSD options:
> > 
> > osd max backfills = 4
> > osd recovery max active = 15
> > debug osd = 0/0
> > osd op threads = 4
> > osd backfill scan min = 4
> > osd backfill scan max = 16
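> > 
> > These live in the [osd] section of ceph.conf; the backfill/recovery
> > limits can also be changed at runtime without a restart, e.g.:
> > 
> >   ceph tell osd.* injectargs '--osd-max-backfills 4 --osd-recovery-max-active 15'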
> > 
> > Any hints would be greatly appreciated.
> > 
> > thanks
> > 
> > nik
> > 
> > 
> > -- 
> > -------------------------------------
> > Ing. Nikola CIPRICH
> > LinuxBox.cz, s.r.o.
> > 28.rijna 168, 709 00 Ostrava
> > 
> > tel.:   +420 591 166 214
> > fax:    +420 596 621 273
> > mobil:  +420 777 093 799
> > www.linuxbox.cz
> > 
> > mobil servis: +420 737 238 656
> > email servis: ser...@xxxxxxxxxxx
> > -------------------------------------
> 

-- 
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis@xxxxxxxxxxx
-------------------------------------
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
