Hi Nico,

What Ceph version are you running? There were changes in recovery priorities merged into jewel 10.2.7+ and luminous which should cover exactly this case. If you are already on luminous, you can also try nudging the inactive PGs to the front of the recovery queue by hand; see the rough sketch below the quoted message.

Regards,
Bartek

> Message written by Nico Schottelius <nico.schottelius@xxxxxxxxxxx> on 03.02.2018 at 12:55:
>
> Good morning,
>
> after another disk failure, we currently have 7 inactive pgs [1], which
> are stalling IO from the affected VMs.
>
> It seems that ceph, when rebuilding, does not focus on repairing the
> inactive PGs first, which surprised us quite a lot:
>
> It does not repair the inactive ones first, but mixes inactive with
> active+undersized+degraded+remapped+backfill_wait.
>
> Is this a misconfiguration on our side or a design aspect of ceph?
>
> I have attached ceph -s from three points in time while rebuilding below.
>
> First the number of PGs in active+undersized+degraded+remapped+backfill_wait
> decreases, and only much later does the number in
> undersized+degraded+remapped+backfill_wait+peered decrease.
>
> If anyone could comment on this, I would be very thankful to know how to
> proceed here, as we had 6 disk failures this week and each time we had
> inactive pgs that stalled the VM I/O.
>
> Best,
>
> Nico
>
>
> [1]
>   cluster:
>     id:     26c0c5a8-d7ce-49ac-b5a7-bfd9d0ba81ab
>     health: HEALTH_WARN
>             108752/3920931 objects misplaced (2.774%)
>             Reduced data availability: 7 pgs inactive
>             Degraded data redundancy: 419786/3920931 objects degraded (10.706%), 147 pgs unclean, 140 pgs degraded, 140 pgs undersized
>
>   services:
>     mon: 3 daemons, quorum server5,server3,server2
>     mgr: server5(active), standbys: server3, server2
>     osd: 53 osds: 52 up, 52 in; 147 remapped pgs
>
>   data:
>     pools:   2 pools, 1280 pgs
>     objects: 1276k objects, 4997 GB
>     usage:   13481 GB used, 26853 GB / 40334 GB avail
>     pgs:     0.547% pgs not active
>              419786/3920931 objects degraded (10.706%)
>              108752/3920931 objects misplaced (2.774%)
>              1133 active+clean
>              108  active+undersized+degraded+remapped+backfill_wait
>              25   active+undersized+degraded+remapped+backfilling
>              7    active+remapped+backfill_wait
>              6    undersized+degraded+remapped+backfilling+peered
>              1    undersized+degraded+remapped+backfill_wait+peered
>
>   io:
>     client:   29980 B/s rd, 1111 kB/s wr, 17 op/s rd, 74 op/s wr
>     recovery: 71727 kB/s, 17 objects/s
>
> [2]
>
> [11:20:15] server3:~# ceph -s
>   cluster:
>     id:     26c0c5a8-d7ce-49ac-b5a7-bfd9d0ba81ab
>     health: HEALTH_WARN
>             103908/3920967 objects misplaced (2.650%)
>             Reduced data availability: 7 pgs inactive
>             Degraded data redundancy: 380860/3920967 objects degraded (9.713%), 144 pgs unclean, 137 pgs degraded, 137 pgs undersized
>
>   services:
>     mon: 3 daemons, quorum server5,server3,server2
>     mgr: server5(active), standbys: server3, server2
>     osd: 53 osds: 52 up, 52 in; 144 remapped pgs
>
>   data:
>     pools:   2 pools, 1280 pgs
>     objects: 1276k objects, 4997 GB
>     usage:   13630 GB used, 26704 GB / 40334 GB avail
>     pgs:     0.547% pgs not active
>              380860/3920967 objects degraded (9.713%)
>              103908/3920967 objects misplaced (2.650%)
>              1136 active+clean
>              105  active+undersized+degraded+remapped+backfill_wait
>              25   active+undersized+degraded+remapped+backfilling
>              7    active+remapped+backfill_wait
>              6    undersized+degraded+remapped+backfilling+peered
>              1    undersized+degraded+remapped+backfill_wait+peered
>
>   io:
>     client:   40201 B/s rd, 1189 kB/s wr, 16 op/s rd, 74 op/s wr
>     recovery: 54519 kB/s, 13 objects/s
>
>
> [3]
>
>   cluster:
>     id:     26c0c5a8-d7ce-49ac-b5a7-bfd9d0ba81ab
>     health: HEALTH_WARN
>             88382/3921066 objects misplaced (2.254%)
>             Reduced data availability: 4 pgs inactive
>             Degraded data redundancy: 285528/3921066 objects degraded (7.282%), 127 pgs unclean, 121 pgs degraded, 115 pgs undersized
>             14 slow requests are blocked > 32 sec
>
>   services:
>     mon: 3 daemons, quorum server5,server3,server2
>     mgr: server5(active), standbys: server3, server2
>     osd: 53 osds: 52 up, 52 in; 121 remapped pgs
>
>   data:
>     pools:   2 pools, 1280 pgs
>     objects: 1276k objects, 4997 GB
>     usage:   14014 GB used, 26320 GB / 40334 GB avail
>     pgs:     0.313% pgs not active
>              285528/3921066 objects degraded (7.282%)
>              88382/3921066 objects misplaced (2.254%)
>              1153 active+clean
>              78   active+undersized+degraded+remapped+backfill_wait
>              33   active+undersized+degraded+remapped+backfilling
>              6    active+recovery_wait+degraded
>              6    active+remapped+backfill_wait
>              2    undersized+degraded+remapped+backfill_wait+peered
>              2    undersized+degraded+remapped+backfilling+peered
>
>   io:
>     client:   56370 B/s rd, 5304 kB/s wr, 11 op/s rd, 44 op/s wr
>     recovery: 37838 kB/s, 9 objects/s
>
>
> And our tree:
>
> [12:53:57] server4:~# ceph osd tree
> ID CLASS WEIGHT   TYPE NAME        STATUS REWEIGHT PRI-AFF
> -1       39.84532 root default
> -6        7.28383     host server1
> 25   hdd  4.59999         osd.25       up  1.00000 1.00000
> 48   ssd  0.22198         osd.48       up  1.00000 1.00000
> 49   ssd  0.22198         osd.49       up  1.00000 1.00000
> 50   ssd  0.22198         osd.50       up  1.00000 1.00000
> 51   ssd  0.22699         osd.51       up  1.00000 1.00000
> 52   ssd  0.22198         osd.52       up  1.00000 1.00000
> 53   ssd  0.22198         osd.53       up  1.00000 1.00000
> 54   ssd  0.22198         osd.54       up  1.00000 1.00000
> 55   ssd  0.22699         osd.55       up  1.00000 1.00000
> 56   ssd  0.22198         osd.56       up  1.00000 1.00000
> 57   ssd  0.22198         osd.57       up  1.00000 1.00000
> 58   ssd  0.22699         osd.58       up  1.00000 1.00000
> 59   ssd  0.22699         osd.59       up  1.00000 1.00000
> -2       11.95193     host server2
> 21   hdd  4.59999         osd.21       up  1.00000 1.00000
> 24   hdd  4.59999         osd.24       up  1.00000 1.00000
>  0   ssd  0.68799         osd.0        up  1.00000 1.00000
>  4   ssd  0.68799         osd.4        up  1.00000 1.00000
>  6   ssd  0.68799         osd.6        up  1.00000 1.00000
> 10   ssd  0.68799         osd.10       up  1.00000 1.00000
> -3        6.71286     host server3
> 17   hdd  0.09999         osd.17       up  1.00000 1.00000
> 20   hdd  4.59999         osd.20     down        0 1.00000
>  1   ssd  0.22198         osd.1        up  1.00000 1.00000
>  7   ssd  0.22198         osd.7        up  1.00000 1.00000
> 12   ssd  0.22198         osd.12       up  1.00000 1.00000
> 15   ssd  0.22699         osd.15       up  1.00000 1.00000
> 23   ssd  0.22198         osd.23       up  1.00000 1.00000
> 27   ssd  0.22198         osd.27       up  1.00000 1.00000
> 29   ssd  0.22699         osd.29       up  1.00000 1.00000
> 33   ssd  0.22198         osd.33       up  1.00000 1.00000
> 42   ssd  0.22699         osd.42       up  1.00000 1.00000
> -5        6.61287     host server4
> 31   hdd  4.59999         osd.31       up  1.00000 1.00000
>  3   ssd  0.22198         osd.3        up  1.00000 1.00000
> 11   ssd  0.22198         osd.11       up  1.00000 1.00000
> 16   ssd  0.22699         osd.16       up  1.00000 1.00000
> 19   ssd  0.22198         osd.19       up  1.00000 1.00000
> 28   ssd  0.22198         osd.28       up  1.00000 1.00000
> 37   ssd  0.22198         osd.37       up  1.00000 1.00000
> 41   ssd  0.22198         osd.41       up  1.00000 1.00000
> 43   ssd  0.22699         osd.43       up  1.00000 1.00000
> 46   ssd  0.22699         osd.46       up  1.00000 1.00000
> -4        7.28383     host server5
>  8   hdd  4.59999         osd.8        up  1.00000 1.00000
>  2   ssd  0.22198         osd.2        up  1.00000 1.00000
>  5   ssd  0.22198         osd.5        up  1.00000 1.00000
>  9   ssd  0.22198         osd.9        up  1.00000 1.00000
> 14   ssd  0.22699         osd.14       up  1.00000 1.00000
> 18   ssd  0.22198         osd.18       up  1.00000 1.00000
> 22   ssd  0.22198         osd.22       up  1.00000 1.00000
> 26   ssd  0.22198         osd.26       up  1.00000 1.00000
> 30   ssd  0.22699         osd.30       up  1.00000 1.00000
> 36   ssd  0.22198         osd.36       up  1.00000 1.00000
> 40   ssd  0.22198         osd.40       up  1.00000 1.00000
> 45   ssd  0.22699         osd.45       up  1.00000 1.00000
> 47   ssd  0.22699         osd.47       up  1.00000 1.00000
> [12:54:13] server4:~#
>
>
> --
> Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
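
A rough sketch of the commands this would involve, in case it is useful. The PG id 1.2f3 below is only a placeholder (take the real ids from the dump_stuck output), and the force-recovery/force-backfill commands only exist on luminous:

    # check what the daemons are actually running
    ceph versions               # available on luminous
    ceph tell osd.* version     # also works on jewel

    # list the PGs that are stuck inactive / peered
    ceph health detail
    ceph pg dump_stuck inactive
    ceph pg 1.2f3 query         # peering/recovery state of a single PG

    # on luminous, move those PGs to the front of the recovery/backfill queue
    ceph pg force-recovery 1.2f3
    ceph pg force-backfill 1.2f3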
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com