Hi Nico,

What Ceph version are you running? There were changes in recovery priorities merged into jewel 10.2.7+ and luminous which should cover exactly this case. If you are already on luminous, you can also try nudging the inactive PGs to the front of the recovery queue by hand; see the rough sketch below the quoted message.

Regards,
Bartek

> Message written by Nico Schottelius <nico.schottelius@xxxxxxxxxxx> on 03.02.2018 at 12:55:
>
> Good morning,
>
> after another disk failure, we currently have 7 inactive pgs [1], which
> are stalling IO from the affected VMs.
>
> It seems that ceph, when rebuilding, does not focus on repairing the
> inactive PGs first, which surprised us quite a lot:
>
> It does not repair the inactive ones first, but mixes inactive with
> active+undersized+degraded+remapped+backfill_wait.
>
> Is this a misconfiguration on our side or a design aspect of ceph?
>
> I have attached ceph -s from three points in time while rebuilding below.
>
> First the number of PGs in active+undersized+degraded+remapped+backfill_wait
> decreases, and only much later does the number in
> undersized+degraded+remapped+backfill_wait+peered decrease.
>
> If anyone could comment on this, I would be very thankful to know how to
> proceed here, as we had 6 disk failures this week and each time we had
> inactive pgs that stalled the VM I/O.
>
> Best,
>
> Nico
>
>
> [1]
>   cluster:
>     id:     26c0c5a8-d7ce-49ac-b5a7-bfd9d0ba81ab
>     health: HEALTH_WARN
>             108752/3920931 objects misplaced (2.774%)
>             Reduced data availability: 7 pgs inactive
>             Degraded data redundancy: 419786/3920931 objects degraded (10.706%), 147 pgs unclean, 140 pgs degraded, 140 pgs undersized
>
>   services:
>     mon: 3 daemons, quorum server5,server3,server2
>     mgr: server5(active), standbys: server3, server2
>     osd: 53 osds: 52 up, 52 in; 147 remapped pgs
>
>   data:
>     pools:   2 pools, 1280 pgs
>     objects: 1276k objects, 4997 GB
>     usage:   13481 GB used, 26853 GB / 40334 GB avail
>     pgs:     0.547% pgs not active
>              419786/3920931 objects degraded (10.706%)
>              108752/3920931 objects misplaced (2.774%)
>              1133 active+clean
>              108  active+undersized+degraded+remapped+backfill_wait
>              25   active+undersized+degraded+remapped+backfilling
>              7    active+remapped+backfill_wait
>              6    undersized+degraded+remapped+backfilling+peered
>              1    undersized+degraded+remapped+backfill_wait+peered
>
>   io:
>     client:   29980 B/s rd, 1111 kB/s wr, 17 op/s rd, 74 op/s wr
>     recovery: 71727 kB/s, 17 objects/s
>
> [2]
>
> [11:20:15] server3:~# ceph -s
>   cluster:
>     id:     26c0c5a8-d7ce-49ac-b5a7-bfd9d0ba81ab
>     health: HEALTH_WARN
>             103908/3920967 objects misplaced (2.650%)
>             Reduced data availability: 7 pgs inactive
>             Degraded data redundancy: 380860/3920967 objects degraded (9.713%), 144 pgs unclean, 137 pgs degraded, 137 pgs undersized
>
>   services:
>     mon: 3 daemons, quorum server5,server3,server2
>     mgr: server5(active), standbys: server3, server2
>     osd: 53 osds: 52 up, 52 in; 144 remapped pgs
>
>   data:
>     pools:   2 pools, 1280 pgs
>     objects: 1276k objects, 4997 GB
>     usage:   13630 GB used, 26704 GB / 40334 GB avail
>     pgs:     0.547% pgs not active
>              380860/3920967 objects degraded (9.713%)
>              103908/3920967 objects misplaced (2.650%)
>              1136 active+clean
>              105  active+undersized+degraded+remapped+backfill_wait
>              25   active+undersized+degraded+remapped+backfilling
>              7    active+remapped+backfill_wait
>              6    undersized+degraded+remapped+backfilling+peered
>              1    undersized+degraded+remapped+backfill_wait+peered
>
>   io:
>     client:   40201 B/s rd, 1189 kB/s wr, 16 op/s rd, 74 op/s wr
>     recovery: 54519 kB/s, 13 objects/s
>
>
> [3]
>
>   cluster:
>     id:     26c0c5a8-d7ce-49ac-b5a7-bfd9d0ba81ab
>     health: HEALTH_WARN
>             88382/3921066 objects misplaced (2.254%)
>             Reduced data availability: 4 pgs inactive
>             Degraded data redundancy: 285528/3921066 objects degraded (7.282%), 127 pgs unclean, 121 pgs degraded, 115 pgs undersized
>             14 slow requests are blocked > 32 sec
>
>   services:
>     mon: 3 daemons, quorum server5,server3,server2
>     mgr: server5(active), standbys: server3, server2
>     osd: 53 osds: 52 up, 52 in; 121 remapped pgs
>
>   data:
>     pools:   2 pools, 1280 pgs
>     objects: 1276k objects, 4997 GB
>     usage:   14014 GB used, 26320 GB / 40334 GB avail
>     pgs:     0.313% pgs not active
>              285528/3921066 objects degraded (7.282%)
>              88382/3921066 objects misplaced (2.254%)
>              1153 active+clean
>              78   active+undersized+degraded+remapped+backfill_wait
>              33   active+undersized+degraded+remapped+backfilling
>              6    active+recovery_wait+degraded
>              6    active+remapped+backfill_wait
>              2    undersized+degraded+remapped+backfill_wait+peered
>              2    undersized+degraded+remapped+backfilling+peered
>
>   io:
>     client:   56370 B/s rd, 5304 kB/s wr, 11 op/s rd, 44 op/s wr
>     recovery: 37838 kB/s, 9 objects/s
>
>
> And our tree:
>
> [12:53:57] server4:~# ceph osd tree
> ID CLASS WEIGHT   TYPE NAME        STATUS REWEIGHT PRI-AFF
> -1       39.84532 root default
> -6        7.28383     host server1
> 25   hdd  4.59999         osd.25       up  1.00000 1.00000
> 48   ssd  0.22198         osd.48       up  1.00000 1.00000
> 49   ssd  0.22198         osd.49       up  1.00000 1.00000
> 50   ssd  0.22198         osd.50       up  1.00000 1.00000
> 51   ssd  0.22699         osd.51       up  1.00000 1.00000
> 52   ssd  0.22198         osd.52       up  1.00000 1.00000
> 53   ssd  0.22198         osd.53       up  1.00000 1.00000
> 54   ssd  0.22198         osd.54       up  1.00000 1.00000
> 55   ssd  0.22699         osd.55       up  1.00000 1.00000
> 56   ssd  0.22198         osd.56       up  1.00000 1.00000
> 57   ssd  0.22198         osd.57       up  1.00000 1.00000
> 58   ssd  0.22699         osd.58       up  1.00000 1.00000
> 59   ssd  0.22699         osd.59       up  1.00000 1.00000
> -2       11.95193     host server2
> 21   hdd  4.59999         osd.21       up  1.00000 1.00000
> 24   hdd  4.59999         osd.24       up  1.00000 1.00000
>  0   ssd  0.68799         osd.0        up  1.00000 1.00000
>  4   ssd  0.68799         osd.4        up  1.00000 1.00000
>  6   ssd  0.68799         osd.6        up  1.00000 1.00000
> 10   ssd  0.68799         osd.10       up  1.00000 1.00000
> -3        6.71286     host server3
> 17   hdd  0.09999         osd.17       up  1.00000 1.00000
> 20   hdd  4.59999         osd.20     down        0 1.00000
>  1   ssd  0.22198         osd.1        up  1.00000 1.00000
>  7   ssd  0.22198         osd.7        up  1.00000 1.00000
> 12   ssd  0.22198         osd.12       up  1.00000 1.00000
> 15   ssd  0.22699         osd.15       up  1.00000 1.00000
> 23   ssd  0.22198         osd.23       up  1.00000 1.00000
> 27   ssd  0.22198         osd.27       up  1.00000 1.00000
> 29   ssd  0.22699         osd.29       up  1.00000 1.00000
> 33   ssd  0.22198         osd.33       up  1.00000 1.00000
> 42   ssd  0.22699         osd.42       up  1.00000 1.00000
> -5        6.61287     host server4
> 31   hdd  4.59999         osd.31       up  1.00000 1.00000
>  3   ssd  0.22198         osd.3        up  1.00000 1.00000
> 11   ssd  0.22198         osd.11       up  1.00000 1.00000
> 16   ssd  0.22699         osd.16       up  1.00000 1.00000
> 19   ssd  0.22198         osd.19       up  1.00000 1.00000
> 28   ssd  0.22198         osd.28       up  1.00000 1.00000
> 37   ssd  0.22198         osd.37       up  1.00000 1.00000
> 41   ssd  0.22198         osd.41       up  1.00000 1.00000
> 43   ssd  0.22699         osd.43       up  1.00000 1.00000
> 46   ssd  0.22699         osd.46       up  1.00000 1.00000
> -4        7.28383     host server5
>  8   hdd  4.59999         osd.8        up  1.00000 1.00000
>  2   ssd  0.22198         osd.2        up  1.00000 1.00000
>  5   ssd  0.22198         osd.5        up  1.00000 1.00000
>  9   ssd  0.22198         osd.9        up  1.00000 1.00000
> 14   ssd  0.22699         osd.14       up  1.00000 1.00000
> 18   ssd  0.22198         osd.18       up  1.00000 1.00000
> 22   ssd  0.22198         osd.22       up  1.00000 1.00000
> 26   ssd  0.22198         osd.26       up  1.00000 1.00000
> 30   ssd  0.22699         osd.30       up  1.00000 1.00000
> 36   ssd  0.22198         osd.36       up  1.00000 1.00000
> 40   ssd  0.22198         osd.40       up  1.00000 1.00000
> 45   ssd  0.22699         osd.45       up  1.00000 1.00000
> 47   ssd  0.22699         osd.47       up  1.00000 1.00000
> [12:54:13] server4:~#
>
>
> --
> Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
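
A rough sketch of the commands this would involve, in case it is useful. The PG id 1.2f3 below is only a placeholder (take the real ids from the dump_stuck output), and the force-recovery/force-backfill commands only exist on luminous:

    # check what the daemons are actually running
    ceph versions               # available on luminous
    ceph tell osd.* version     # also works on jewel

    # list the PGs that are stuck inactive / peered
    ceph health detail
    ceph pg dump_stuck inactive
    ceph pg 1.2f3 query         # peering/recovery state of a single PG

    # on luminous, move those PGs to the front of the recovery/backfill queue
    ceph pg force-recovery 1.2f3
    ceph pg force-backfill 1.2f3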
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com