Do you see anything in the OSD logs that explains why the PGs were
inactive? Did you hit the max PGs per OSD limit? Which CRUSH rule is
in place for the affected pool? Your osd tree output could also help
to figure out what happened.
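For reference, this is roughly how that information can be gathered
(the pool name is a placeholder; replace it with the affected pool):

```shell
# OSD topology and weights -- shows where the new NVMe OSD landed
ceph osd tree

# Effective max-PGs-per-OSD limit (default is 250 in recent releases)
ceph config get mon mon_max_pg_per_osd

# Which CRUSH rule the affected pool uses, plus the rule definitions
ceph osd pool get <pool> crush_rule
ceph osd crush rule dump

# Detailed peering/backfill state of one of the stuck PGs
ceph pg 2.37 query
```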
Quoting Paweł Kowalski <pk@xxxxxxxxxxxx>:
Hello,
Two PGs were stuck inactive, and I'm trying to find out why this
happened. Here is what I found in the logs (only what I think is
important):
2022-10-27T22:39:29.707362+0200 mgr.skarb (mgr.40364478) 152930 :
cluster [DBG] pgmap v154892: 410 pgs: 1 active+remapped+backfilling,
2 active+clean+scrubbing+deep, 1 active+clean+snaptrim, 24
active+remapped+backfill_wait, 382 active+clean; 3.5 TiB data, 11
TiB used, 19 TiB / 30 TiB avail; 11 KiB/s rd, 1.1 MiB/s wr, 168
op/s; 142367/2859735 objects misplaced (4.978%); 12 MiB/s, 3
objects/s recovering
2022-10-27T22:39:30.266560+0200 mon.skarb (mon.0) 4717148 : cluster
[DBG] osdmap e18492: 9 total, 9 up, 9 in
[...]
2022-10-27T22:39:31.317055+0200 osd.3 (osd.3) 106245 : cluster [DBG]
2.51 starting backfill to osd.7 from (0'0,0'0] MAX to 18489'135867589
2022-10-27T22:39:31.524742+0200 osd.3 (osd.3) 106246 : cluster [INF]
2.31 continuing backfill to osd.2 from
(18485'135835568,18490'135837516] MIN to 18490'135837516
2022-10-27T22:39:31.713775+0200 mgr.skarb (mgr.40364478) 152931 :
cluster [DBG] pgmap v154895: 409 pgs: 3 peering, 1
active+remapped+backfilling, 2 active+clean+scrubbing+deep, 24
active+remapped+backfill_wait, 379 active+clean; 3.5 TiB data, 11 TiB used, 19 TiB / 30 TiB
avail; 5.8 KiB/s rd, 874 KiB/s wr, 108 op/s; 142367/2859732 objects
misplaced (4.978%); 10 MiB/s, 3 objects/s recovering
[...]
2022-10-27T22:39:33.333564+0200 mon.skarb (mon.0) 4717153 : cluster
[DBG] osdmap e18495: 9 total, 9 up, 9 in
2022-10-27T22:39:33.337963+0200 osd.2 (osd.2) 62236 : cluster [DBG]
2.77 starting backfill to osd.1 from (0'0,0'0] MAX to 18489'149579766
2022-10-27T22:39:33.338240+0200 osd.2 (osd.2) 62237 : cluster [DBG]
2.37 starting backfill to osd.1 from (0'0,0'0] MAX to 18489'149493224
2022-10-27T22:39:33.718993+0200 mgr.skarb (mgr.40364478) 152932 :
cluster [DBG] pgmap v154898: 409 pgs: 13 activating+remapped, 3
remapped+peering, 3 peering, 2 active+remapped+backfilling, 2
active+clean+scrubbing+deep, 23 active+remapped+backfill_wait, 363
active+clean; 3.5 TiB data, 11 TiB used, 19 TiB / 30 TiB avail; 804
KiB/s wr, 98 op/s; 208558/2859732 objects misplaced (7.293%); 35
MiB/s, 8 objects/s recovering
[...]
2022-10-27T22:39:51.784172+0200 mgr.skarb (mgr.40364478) 152941 :
cluster [DBG] pgmap v154913: 408 pgs: 2
remapped+premerge+backfill_wait+peered, 3
active+remapped+backfilling, 2 active+clean+scrubbing+deep, 87 active+remapped+backfill_wait, 314 active+clean; 3.5 TiB
data, 11 TiB used, 19 TiB / 30 TiB avail; 451185/2858388 objects
misplaced (15.785%)
[...]
2022-10-27T22:40:00.000177+0200 mon.skarb (mon.0) 4717169 : cluster
[INF] overall HEALTH_OK
2022-10-27T22:40:01.821923+0200 mgr.skarb (mgr.40364478) 152946 :
cluster [DBG] pgmap v154918: 408 pgs: 2
remapped+premerge+backfill_wait+peered, 3
active+remapped+backfilling, 2 active+clean+scrubbing+deep, 87 active+remapped+backfill_wait, 314 active+clean; 3.5 TiB
data, 11 TiB used, 19 TiB / 30 TiB avail; 117 MiB/s rd, 150 MiB/s
wr, 5.92k op/s; 450997/2859732 objects misplaced (15.771%); 264
MiB/s, 74 objects/s recovering
[...]
2022-10-27T22:40:32.115085+0200 mon.skarb (mon.0) 4717180 : cluster
[WRN] Health check failed: Reduced data availability: 2 pgs inactive
(PG_AVAILABILITY)
2022-10-27T22:40:34.038597+0200 mgr.skarb (mgr.40364478) 152962 :
cluster [DBG] pgmap v154934: 408 pgs: 2
remapped+premerge+backfill_wait+peered, 3
active+remapped+backfilling, 2 active+clean+scrubbing+deep, 87
active+remapped+backfill_wait, 314 active+clean; 3.5 TiB data, 11
TiB used, 19 TiB / 30 TiB avail; 6.0 KiB/s rd, 3.2 MiB/s wr, 209
op/s; 450255/2859732 objects misplaced (15.745%); 44 MiB/s, 14
objects/s recovering
[...]
2022-10-27T22:49:59.062366+0200 mgr.skarb (mgr.40364478) 153243 :
cluster [DBG] pgmap v155219: 408 pgs: 1 active+clean+scrubbing+deep,
2 remapped+premerge+backfill_wait+peered, 4 active+remapped+backfilling, 84 active+remapped+backfill_wait, 317 active+clean; 3.5 TiB
data, 11 TiB used, 19 TiB / 30 TiB avail; 4.6 KiB/s rd, 1.6 MiB/s
wr, 108 op/s; 438424/2859736 objects misplaced (15.331%); 57 MiB/s,
16 objects/s recovering
2022-10-27T22:50:00.000198+0200 mon.skarb (mon.0) 4717320 : cluster
[WRN] Health detail: HEALTH_WARN Reduced data availability: 2 pgs
inactive
2022-10-27T22:50:00.000245+0200 mon.skarb (mon.0) 4717321 : cluster
[WRN] [WRN] PG_AVAILABILITY: Reduced data availability: 2 pgs inactive
2022-10-27T22:50:00.000263+0200 mon.skarb (mon.0) 4717322 : cluster
[WRN] pg 2.37 is stuck inactive for 10m, current state
remapped+premerge+backfill_wait+peered, last acting [2,6,3]
2022-10-27T22:50:00.000307+0200 mon.skarb (mon.0) 4717323 : cluster
[WRN] pg 2.77 is stuck inactive for 10m, current state
remapped+premerge+backfill_wait+peered, last acting [2,6,3]
2022-10-27T22:50:01.069589+0200 mgr.skarb (mgr.40364478) 153244 :
cluster [DBG] pgmap v155220: 408 pgs: 1 active+clean+scrubbing+deep,
2 remapped+premerge+backfill_wait+peered, 4 active+remapped+backfilling, 84 active+remapped+backfill_wait, 317 active+clean; 3.5 TiB
data, 11 TiB used, 19 TiB / 30 TiB avail; 4.6 KiB/s rd, 1.3 MiB/s
wr, 94 op/s; 438413/2859736 objects misplaced (15.331%); 50 MiB/s,
14 objects/s recovering
It recovered a few hours later:
2022-10-28T03:43:22.268113+0200 mgr.skarb (mgr.40364478) 162018 :
cluster [DBG] pgmap v164234: 408 pgs: 1
remapped+premerge+backfilling+peered, 1 clean+premerge+peered, 1
active+clean+snaptrim, 42 active+remapped+backfill_wait, 363
active+clean; 3.5 TiB data, 11 TiB used, 19 TiB / 30 TiB avail; 1.3
KiB/s rd, 105 KiB/s wr, 22 op/s; 233157/2841886 objects misplaced
(8.204%); 28 MiB/s, 7 objects/s recovering
2022-10-28T03:43:24.269793+0200 mgr.skarb (mgr.40364478) 162019 :
cluster [DBG] pgmap v164235: 408 pgs: 1
remapped+premerge+backfilling+peered, 1 clean+premerge+peered, 1
active+clean+snaptrim, 42 active+remapped+backfill_wait, 363
active+clean; 3.5 TiB data, 11 TiB used, 19 TiB / 30 TiB avail; 1.3
KiB/s rd, 91 KiB/s wr, 19 op/s; 233157/2841886 objects misplaced
(8.204%); 19 MiB/s, 5 objects/s recovering
2022-10-28T03:43:26.271601+0200 mgr.skarb (mgr.40364478) 162020 :
cluster [DBG] pgmap v164236: 408 pgs: 1
remapped+premerge+backfilling+peered, 1 clean+premerge+peered, 1
active+clean+snaptrim, 42 active+remapped+backfill_wait, 363
active+clean; 3.5 TiB data, 11 TiB used, 19 TiB / 30 TiB avail; 1.3
KiB/s rd, 77 KiB/s wr, 17 op/s; 233157/2841877 objects misplaced
(8.204%); 28 MiB/s, 7 objects/s recovering
2022-10-28T03:43:28.273799+0200 mgr.skarb (mgr.40364478) 162021 :
cluster [DBG] pgmap v164237: 408 pgs: 1 active+remapped+backfilling,
1 remapped+premerge+backfilling+peered, 1 clean+premerge+peered, 1
active+clean+snaptrim, 41 active+remapped+backfill_wait, 363
active+clean; 3.5 TiB data, 11 TiB used, 19 TiB / 30 TiB avail; 1.3
KiB/s rd, 64 KiB/s wr, 14 op/s; 233157/2841877 objects misplaced
(8.204%); 18 MiB/s, 4 objects/s recovering
2022-10-28T03:43:28.379923+0200 mon.skarb (mon.0) 4719355 : cluster
[DBG] osdmap e18746: 9 total, 9 up, 9 in
2022-10-28T03:43:30.381806+0200 mon.skarb (mon.0) 4719356 : cluster
[WRN] Health check update: Reduced data availability: 1 pg inactive
(PG_AVAILABILITY)
2022-10-28T03:43:30.389645+0200 mon.skarb (mon.0) 4719357 : cluster
[DBG] osdmap e18747: 9 total, 9 up, 9 in
2022-10-28T03:43:30.275554+0200 mgr.skarb (mgr.40364478) 162022 :
cluster [DBG] pgmap v164239: 407 pgs: 1 active+remapped+backfilling,
1 clean+premerge+peered, 1 active+clean+snaptrim, 41
active+remapped+backfill_wait, 363 active+clean; 3.5 TiB data, 11
TiB used, 19 TiB / 30 TiB avail; 233157/2837083 objects misplaced
(8.218%)
And here is an overview of this situation:
root@skarb:/var/log/ceph# cat ceph.log.1.gz | gunzip | grep inactive
2022-10-27T22:40:32.115085+0200 mon.skarb (mon.0) 4717180 : cluster
[WRN] Health check failed: Reduced data availability: 2 pgs inactive
(PG_AVAILABILITY)
2022-10-27T22:50:00.000198+0200 mon.skarb (mon.0) 4717320 : cluster
[WRN] Health detail: HEALTH_WARN Reduced data availability: 2 pgs
inactive
2022-10-27T22:50:00.000245+0200 mon.skarb (mon.0) 4717321 : cluster
[WRN] [WRN] PG_AVAILABILITY: Reduced data availability: 2 pgs inactive
2022-10-27T22:50:00.000263+0200 mon.skarb (mon.0) 4717322 : cluster
[WRN] pg 2.37 is stuck inactive for 10m, current state
remapped+premerge+backfill_wait+peered, last acting [2,6,3]
2022-10-27T22:50:00.000307+0200 mon.skarb (mon.0) 4717323 : cluster
[WRN] pg 2.77 is stuck inactive for 10m, current state
remapped+premerge+backfill_wait+peered, last acting [2,6,3]
2022-10-27T23:00:00.000173+0200 mon.skarb (mon.0) 4717460 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-27T23:10:00.000149+0200 mon.skarb (mon.0) 4717592 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-27T23:20:00.000140+0200 mon.skarb (mon.0) 4717718 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-27T23:30:00.000198+0200 mon.skarb (mon.0) 4717855 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-27T23:40:00.000133+0200 mon.skarb (mon.0) 4717988 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-27T23:50:00.000102+0200 mon.skarb (mon.0) 4718119 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-28T00:00:00.000129+0200 mon.skarb (mon.0) 4718259 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
root@skarb:/var/log/ceph# cat ceph.log | grep inactive
2022-10-28T00:10:00.000139+0200 mon.skarb (mon.0) 4718393 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-28T00:20:00.000265+0200 mon.skarb (mon.0) 4718525 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-28T00:30:00.004136+0200 mon.skarb (mon.0) 4718661 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-28T00:40:00.000131+0200 mon.skarb (mon.0) 4718741 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-28T00:50:00.000130+0200 mon.skarb (mon.0) 4718774 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-28T01:00:00.000094+0200 mon.skarb (mon.0) 4718805 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-28T01:10:00.000168+0200 mon.skarb (mon.0) 4718838 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-28T01:20:00.000146+0200 mon.skarb (mon.0) 4718873 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-28T01:30:00.000152+0200 mon.skarb (mon.0) 4718904 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-28T01:40:00.000151+0200 mon.skarb (mon.0) 4718935 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-28T01:50:00.000144+0200 mon.skarb (mon.0) 4718967 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-28T02:00:00.000170+0200 mon.skarb (mon.0) 4719004 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-28T02:10:00.000158+0200 mon.skarb (mon.0) 4719047 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive
2022-10-28T02:20:00.000109+0200 mon.skarb (mon.0) 4719078 : cluster
[WRN] Health detail: HEALTH_WARN Reduced data availability: 2 pgs
inactive; 3 pgs not deep-scrubbed in time
2022-10-28T02:20:00.000151+0200 mon.skarb (mon.0) 4719079 : cluster
[WRN] [WRN] PG_AVAILABILITY: Reduced data availability: 2 pgs inactive
2022-10-28T02:20:00.000170+0200 mon.skarb (mon.0) 4719080 : cluster
[WRN] pg 2.37 is stuck inactive for 3h, current state
remapped+premerge+backfill_wait+peered, last acting [2,6,3]
2022-10-28T02:20:00.000185+0200 mon.skarb (mon.0) 4719081 : cluster
[WRN] pg 2.77 is stuck inactive for 3h, current state
remapped+premerge+backfill_wait+peered, last acting [2,6,3]
2022-10-28T02:30:00.000152+0200 mon.skarb (mon.0) 4719118 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive;
3 pgs not deep-scrubbed in time
2022-10-28T02:40:00.000141+0200 mon.skarb (mon.0) 4719150 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive;
3 pgs not deep-scrubbed in time
2022-10-28T02:50:00.000163+0200 mon.skarb (mon.0) 4719178 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive;
3 pgs not deep-scrubbed in time
2022-10-28T03:00:00.000148+0200 mon.skarb (mon.0) 4719211 : cluster
[WRN] Health detail: HEALTH_WARN Reduced data availability: 2 pgs
inactive; 4 pgs not deep-scrubbed in time
2022-10-28T03:00:00.000172+0200 mon.skarb (mon.0) 4719212 : cluster
[WRN] [WRN] PG_AVAILABILITY: Reduced data availability: 2 pgs inactive
2022-10-28T03:00:00.000186+0200 mon.skarb (mon.0) 4719213 : cluster
[WRN] pg 2.37 is stuck inactive for 4h, current state
remapped+premerge+backfill_wait+peered, last acting [2,6,3]
2022-10-28T03:00:00.000196+0200 mon.skarb (mon.0) 4719214 : cluster
[WRN] pg 2.77 is stuck inactive for 4h, current state
remapped+premerge+backfill_wait+peered, last acting [2,6,3]
2022-10-28T03:10:00.000128+0200 mon.skarb (mon.0) 4719252 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive;
4 pgs not deep-scrubbed in time
2022-10-28T03:20:00.000143+0200 mon.skarb (mon.0) 4719280 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive;
4 pgs not deep-scrubbed in time
2022-10-28T03:30:00.000150+0200 mon.skarb (mon.0) 4719305 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive;
4 pgs not deep-scrubbed in time
2022-10-28T03:40:00.000151+0200 mon.skarb (mon.0) 4719341 : cluster
[WRN] overall HEALTH_WARN Reduced data availability: 2 pgs inactive;
4 pgs not deep-scrubbed in time
2022-10-28T03:43:30.381806+0200 mon.skarb (mon.0) 4719356 : cluster
[WRN] Health check update: Reduced data availability: 1 pg inactive
(PG_AVAILABILITY)
2022-10-28T03:43:32.444590+0200 mon.skarb (mon.0) 4719360 : cluster
[INF] Health check cleared: PG_AVAILABILITY (was: Reduced data
availability: 1 pg inactive)
There are 9 OSDs, and all of them were up the whole time. However,
the ninth one was added just two days ago, and shortly before that,
two OSDs on one host were destroyed and recreated. When the problem
occurred, no PG was marked as undersized (the third replica had been
recreated by then). However, as you can see above, PGs are being
merged because the autoscaler is turned on.
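The premerge state suggests the autoscaler is shrinking pg_num while
backfill is still running. Its current targets can be checked roughly
like this (pool name is a placeholder):

```shell
# Current vs. target PG counts the autoscaler is working toward
ceph osd pool autoscale-status

# Watch pg_num / pgp_num converge on the affected pool
ceph osd pool get <pool> pg_num
ceph osd pool get <pool> pgp_num

# If merges keep colliding with backfill, the autoscaler can be set
# to only warn instead of acting automatically
ceph osd pool set <pool> pg_autoscale_mode warn
```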
Also, since the new OSD is an NVMe device, some minor changes were
made to the CRUSH map (the default rule was limited to use only HDD
OSDs, and a new rule was created that picks an NVMe as the first OSD
for the new pool).
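In case it helps, the device-class setup can be verified like this
(a sketch; output will depend on the actual class and rule names):

```shell
# List device classes and which OSDs carry each class
ceph osd crush class ls
ceph osd crush class ls-osd nvme

# Dump all CRUSH rules to confirm the hdd restriction and the new rule
ceph osd crush rule dump

# The per-device-class shadow trees should appear here
ceph osd crush tree --show-shadow
```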
So, what happened here, and how can I prevent it from happening again?
Paweł
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx