Re: PG stuck in active+clean+remapped

huang jun <hjwsm1989@xxxxxxxxx> · Sun, 31 Mar 2019 18:28:25 +0800

It seems CRUSH cannot find enough OSDs for this PG.
What is the output of 'ceph osd crush dump', especially the values in
the 'tunables' section?
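
If the full dump is too long to paste, here is a minimal sketch for pulling out just the tunables. It assumes the ceph CLI is on PATH with a working keyring and that the dump is JSON with a top-level "tunables" object; the function names are made up for illustration.

```python
import json
import subprocess


def tunables_from_dump(dump_text):
    """Return the 'tunables' mapping from `ceph osd crush dump` JSON text."""
    dump = json.loads(dump_text)
    # An empty dict is returned if the section is absent.
    return dump.get("tunables", {})


def format_tunables(tunables):
    """Render tunables as sorted 'name = value' lines for pasting into a mail."""
    return "\n".join(f"{name} = {tunables[name]}" for name in sorted(tunables))


def read_tunables():
    """Run the ceph CLI and return its tunables (assumes ceph is installed)."""
    result = subprocess.run(
        ["ceph", "osd", "crush", "dump", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return tunables_from_dump(result.stdout)
```

On a cluster node, print(format_tunables(read_tunables())) should give a short list that is easy to include in a reply.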

Vladimir Prokofev <v@xxxxxxxxxxx> wrote on Wed, 27 Mar 2019 at 04:02:
>
> Ceph 12.2.11, pool size 3, min_size 2.
>
> One node went down today (its private network interface started flapping, and after a while the OSD processes crashed). No big deal: the cluster recovered, but not completely. One PG is stuck in the active+clean+remapped state.
>
> PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES       LOG  DISK_LOG STATE                 STATE_STAMP                VERSION         REPORTED        UP         UP_PRIMARY ACTING     ACTING_PRIMARY LAST_SCRUB      SCRUB_STAMP                LAST_DEEP_SCRUB DEEP_SCRUB_STAMP           SNAPTRIMQ_LEN
> 20.a2       511                  0        0       511       0  1584410172 1500     1500 active+clean+remapped 2019-03-26 20:50:18.639452    96149'189204    96861:935872    [26,14]         26  [26,14,9]             26    96149'189204 2019-03-26 10:47:36.174769    95989'187669 2019-03-22 23:29:02.322848             0
>
> It reports the PG as placed on OSDs 26,14 (the UP set), but it should be on 26,14,9 (the ACTING set). As far as I can see there's nothing wrong with any of those OSDs: they work, host other PGs, peer with each other, etc. I tried restarting all of them one after another, without success.
> OSD 9 hosts 95 other PGs, so I don't think it's PG overdose.
>
> Last line of log from osd.9 mentioning PG 20.a2:
> 2019-03-26 20:50:16.294500 7fe27963a700  1 osd.9 pg_epoch: 96860 pg[20.a2( v 96149'189204 (95989'187645,96149'189204] local-lis/les=96857/96858 n=511 ec=39164/39164 lis/c 96857/96855 les/c/f 96858/96856/66611 96859/96860/96855) [26,14]/[26,14,9] r=2 lpr=96860 pi=[96855,96860)/1 crt=96149'189204 lcod 0'0 remapped NOTIFY mbc={}] state<Start>: transitioning to Stray
>
> Nothing else out of the ordinary, just the usual scrub/deep-scrub notifications.
> Any ideas what it can be, or any other steps to troubleshoot this?
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
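
To make the reasoning concrete: the quoted pg dump shows UP as [26,14] and ACTING as [26,14,9] for a pool of size 3, so the up set is one OSD short. A small sketch of that comparison follows; placement_gap is a hypothetical helper, not part of any Ceph tool.

```python
def placement_gap(up, acting, pool_size):
    """Compare a PG's up and acting sets against the pool size.

    Returns (missing_slots, acting_only):
      missing_slots: how many OSDs the up set is short of pool_size
      acting_only:   OSDs still serving the PG that are not in the up set
    """
    missing_slots = max(pool_size - len(up), 0)
    acting_only = [osd for osd in acting if osd not in up]
    return missing_slots, acting_only


# Values taken from the quoted pg dump for PG 20.a2.
missing, extra = placement_gap(up=[26, 14], acting=[26, 14, 9], pool_size=3)
print(f"up set short by {missing} OSD(s); acting-only OSDs: {extra}")
```

A non-zero first value means the mapping returned fewer OSDs than the pool size, which is why the tunables are worth checking.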

-- 
Thank you!
HuangJun