Re: Data recovery stuck

On Sat, Jul 9, 2016 at 1:20 AM, Pisal, Ranjit Dnyaneshwar
<ranjit.dny.pisal@xxxxxxx> wrote:
> Hi All,
>
> I am in the process of adding new OSDs to the cluster; however, after
> adding the second node, cluster recovery seems to have stopped.
>
> It's been more than 3 days, but the objects degraded % has not improved
> even by 1%.
>
> Will adding further OSDs help improve the situation, or is there any
> other way to speed up the recovery process?
>
> [ceph@MYOPTPDN01 ~]$ ceph -s
>
>     cluster 9e3e9015-f626-4a44-83f7-0a939ef7ec02
>
>      health HEALTH_WARN 315 pgs backfill; 23 pgs backfill_toofull; 3 pgs

You have 23 pgs that are "backfill_toofull". You need to identify these pgs.

You could try increasing the backfill full ratio on the OSDs those pgs map to:

ceph health detail
ceph tell osd.<id> injectargs '--osd-backfill-full-ratio=0.90'
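
To find those pgs and the OSDs they map to, something along these lines
should work (the pg id below is just a placeholder; take the real ids from
the health detail output):

```shell
# List the affected pgs (ids in the output are the ones you need).
ceph health detail | grep backfill_toofull

# See which OSDs a given pg maps to, e.g. for a hypothetical pg 3.45:
ceph pg map 3.45

# Check per-OSD utilization to spot the (near-)full OSDs involved:
ceph osd df
```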

Keep in mind that new storage needs to be added to the cluster as soon as
possible, but I guess that's what you are already trying to do.

You could also look at reweighting the full OSDs if you have other OSDs with
considerable space available.
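
For example (the id and weight here are illustrative; pick the actual full
OSDs from the utilization output and lower their weights gradually):

```shell
# Reduce the weight of a full OSD so data migrates off it
# (<id> and 0.90 are examples, adjust for your cluster).
ceph osd reweight <id> 0.90

# Or let ceph pick over-utilized OSDs automatically:
ceph osd reweight-by-utilization
```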

HTH,
Brad

> backfilling; 53 pgs degraded; 2 pgs recovering; 232 pgs recovery_wait; 552
> pgs stuck unclean; recovery 3622384/90976826 objects degraded (3.982%); 1
> near full osd(s)
>
>      monmap e4: 5 mons at
> {MYOPTPDN01=10.115.1.136:6789/0,MYOPTPDN02=10.115.1.137:6789/0,MYOPTPDN03=10.115.1.138:6789/0,MYOPTPDN04=10.115.1.139:6789/0,MYOPTPDN05=10.115.1.140:6789/0},
> election epoch 6654, quorum 0,1,2,3,4
> MYOPTPDN01,MYOPTPDN02,MYOPTPDN03,MYOPTPDN04,MYOPTPDN05
>
>      osdmap e198079: 171 osds: 171 up, 171 in
>
>       pgmap v26428186: 5696 pgs, 4 pools, 105 TB data, 28526 kobjects
>
>             329 TB used, 136 TB / 466 TB avail
>
>             3622384/90976826 objects degraded (3.982%)
>
>                   23 active+remapped+wait_backfill+backfill_toofull
>
>                  120 active+recovery_wait+remapped
>
>                 5144 active+clean
>
>                    1 active+recovering+remapped
>
>                  104 active+recovery_wait
>
>                   45 active+degraded+remapped+wait_backfill
>
>                    1 active+recovering
>
>                    3 active+remapped+backfilling
>
>                  247 active+remapped+wait_backfill
>
>                    8 active+recovery_wait+degraded+remapped
>
>   client io 62143 kB/s rd, 100 MB/s wr, 14427 op/s
>
> [ceph@MYOPTPDN01 ~]$
>
>
>
> Best Regards,
>
> Ranjit
>
>
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Cheers,
Brad