We've gotten a bit further: after looking at how this remapped count is
determined (pg_temp), we've found the PGs counted as being remapped:

root@ceph01:~# ceph osd dump |grep pg_temp
pg_temp 3.7af [93,1,29]
pg_temp 3.7bc [137,97,5]
pg_temp 3.7d9 [72,120,18]
pg_temp 3.7e8 [80,21,71]
pg_temp 3.7fd [74,51,8]

Looking at 3.7af:

root@ceph01:~# ceph pg map 3.7af
osdmap e15406 pg 3.7af (3.f) -> up [87,156,29] acting [87,156,29]

I'm unclear on why this is staying in pg_temp. Is there a way to clean it
up? I would have expected it to be cleaned up automatically, as per the
docs, but I might be missing something here. (A small loop to repeat this
check for every pg_temp entry at once is at the bottom of this message.)

On Thu, Aug 6, 2020 at 2:40 PM David Orman <ormandj@xxxxxxxxxxxx> wrote:

> Still haven't figured this out. We went ahead and upgraded the entire
> cluster to Podman 2.0.4, and in the process did OS/kernel upgrades and
> rebooted every node, one at a time. We've still got 5 PGs stuck in the
> 'remapped' state according to 'ceph -s', but 0 PGs in that state in the
> pg dump output. Does anybody have any suggestions on what to do about
> this?
>
> On Wed, Aug 5, 2020 at 10:54 AM David Orman <ormandj@xxxxxxxxxxxx> wrote:
>
>> Hi,
>>
>> We see that we have 5 'remapped' PGs, but are unclear why, or what to
>> do about it. We shifted some target ratios for the autobalancer, and it
>> resulted in this state. While adjusting the ratios, we noticed two OSDs
>> go down, but we just restarted the containers for those OSDs with
>> podman, and they came back up. Here's the status output:
>>
>> ###################
>> root@ceph01:~# ceph status
>> INFO:cephadm:Inferring fsid x
>> INFO:cephadm:Inferring config x
>> INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
>>   cluster:
>>     id:     41bb9256-c3bf-11ea-85b9-9e07b0435492
>>     health: HEALTH_OK
>>
>>   services:
>>     mon: 5 daemons, quorum ceph01,ceph04,ceph02,ceph03,ceph05 (age 2w)
>>     mgr: ceph03.ytkuyr(active, since 2w), standbys: ceph01.aqkgbl,
>> ceph02.gcglcg, ceph04.smbdew, ceph05.yropto
>>     osd: 168 osds: 168 up (since 2d), 168 in (since 2d); 5 remapped pgs
>>
>>   data:
>>     pools:   3 pools, 1057 pgs
>>     objects: 18.00M objects, 69 TiB
>>     usage:   119 TiB used, 2.0 PiB / 2.1 PiB avail
>>     pgs:     1056 active+clean
>>              1    active+clean+scrubbing+deep
>>
>>   io:
>>     client:   859 KiB/s rd, 212 MiB/s wr, 644 op/s rd, 391 op/s wr
>>
>> root@ceph01:~#
>> ###################
>>
>> When I look at ceph pg dump, I don't see any marked as remapped:
>>
>> ###################
>> root@ceph01:~# ceph pg dump |grep remapped
>> INFO:cephadm:Inferring fsid x
>> INFO:cephadm:Inferring config x
>> INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
>> dumped all
>> root@ceph01:~#
>> ###################
>>
>> Any idea what might be going on, or how to recover? All OSDs are up,
>> and health is 'OK'. This is Ceph 15.2.4 deployed with cephadm in
>> containers, on Podman 2.0.3.
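
For completeness, here's a rough sketch of a loop to repeat the check from
the top of this message for every pg_temp entry in one pass. It assumes the
same root shell on ceph01 with the ceph CLI available, as in the
transcripts above, and only re-runs 'ceph osd dump' and 'ceph pg map'; it
does not change any cluster state:

# Pull every pg_temp pgid from the osdmap and print its current
# up/acting mapping, so each entry can be compared against the up set.
for pg in $(ceph osd dump | awk '/^pg_temp/ {print $2}'); do
    printf '%s: ' "$pg"
    ceph pg map "$pg"
done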