Re: Remapped PGs

ceph@xxxxxxxxxx · Tue, 11 Aug 2020 22:17:12 +0200

Hi,

I am not sure, but perhaps this could be an effect of the "balancer" module, if you use it?

Hth
Mehmet

On 10 August 2020 17:28:27 CEST, David Orman <ormandj@xxxxxxxxxxxx> wrote:
>We've gotten a bit further. After evaluating how this remapped count was
>determined (pg_temp), we've found the PGs counted as being remapped:
>
>root@ceph01:~# ceph osd dump |grep pg_temp
>pg_temp 3.7af [93,1,29]
>pg_temp 3.7bc [137,97,5]
>pg_temp 3.7d9 [72,120,18]
>pg_temp 3.7e8 [80,21,71]
>pg_temp 3.7fd [74,51,8]
>
>Looking at 3.7af:
>root@ceph01:~# ceph pg map 3.7af
>osdmap e15406 pg 3.7af (3.f) -> up [87,156,29] acting [87,156,29]
>
>I'm unclear why this is staying in pg_temp. Is there a way to clean this
>up? I would have expected it to be cleaned up as per the docs, but I might be
>missing something here.
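
For comparing the two outputs above without doing it by eye, here is a minimal Python sketch. It assumes the text formats are exactly as shown in the `ceph osd dump | grep pg_temp` and `ceph pg map` output quoted above, and the helper names (`parse_pg_temp`, `parse_pg_map`, `looks_stale`) are made up for illustration. The "stale" test is only a heuristic of mine: it flags a pg_temp entry when the acting set equals the up set but differs from the pg_temp set, i.e. the override does not seem to be in effect.

```python
import re

# Line formats assumed from the output quoted in this thread.
PG_TEMP_RE = re.compile(r"^pg_temp\s+(\S+)\s+\[([\d,]*)\]")
PG_MAP_RE = re.compile(
    r"pg\s+(\S+)\s+\(\S+\)\s+->\s+up\s+\[([\d,]*)\]\s+acting\s+\[([\d,]*)\]"
)


def _osds(text):
    """Turn '87,156,29' into [87, 156, 29]."""
    return [int(x) for x in text.split(",") if x]


def parse_pg_temp(osd_dump_text):
    """Return {pgid: [osd, ...]} for every pg_temp line in the given text."""
    entries = {}
    for line in osd_dump_text.splitlines():
        m = PG_TEMP_RE.match(line.strip())
        if m:
            entries[m.group(1)] = _osds(m.group(2))
    return entries


def parse_pg_map(line):
    """Return (pgid, up, acting) from one line of 'ceph pg map <pgid>' output."""
    m = PG_MAP_RE.search(line)
    if not m:
        raise ValueError("unrecognised pg map line: %r" % line)
    return m.group(1), _osds(m.group(2)), _osds(m.group(3))


def looks_stale(pg_temp_osds, up, acting):
    """Heuristic (an assumption, not Ceph's own logic): the entry is suspicious
    if acting matches up but not the pg_temp override."""
    return acting == up and acting != pg_temp_osds


if __name__ == "__main__":
    temp = parse_pg_temp("pg_temp 3.7af [93,1,29]\npg_temp 3.7bc [137,97,5]")
    pgid, up, acting = parse_pg_map(
        "osdmap e15406 pg 3.7af (3.f) -> up [87,156,29] acting [87,156,29]"
    )
    print(pgid, "pg_temp:", temp[pgid], "acting:", acting,
          "stale:", looks_stale(temp[pgid], up, acting))
```

In practice you would feed it the real command output (for example captured with `subprocess.run`) for each of the five PGs listed.
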
>
>On Thu, Aug 6, 2020 at 2:40 PM David Orman <ormandj@xxxxxxxxxxxx> wrote:
>
>> Still haven't figured this out. We went ahead and upgraded the entire
>> cluster to Podman 2.0.4 and in the process did OS/Kernel upgrades and
>> rebooted every node, one at a time. We've still got 5 PGs stuck in
>> 'remapped' state, according to 'ceph -s' but 0 in the pg dump output in
>> that state. Does anybody have any suggestions on what to do about this?
>>
>> On Wed, Aug 5, 2020 at 10:54 AM David Orman <ormandj@xxxxxxxxxxxx> wrote:
>>
>>> Hi,
>>>
>>> We see that we have 5 'remapped' PGs, but are unclear why/what to do
>>> about it. We shifted some target ratios for the autobalancer and it
>>> resulted in this state. When adjusting the ratio, we noticed two OSDs go down,
>>> but we just restarted the container for those OSDs with podman, and they
>>> came back up. Here's the status output:
>>>
>>> ###################
>>> root@ceph01:~# ceph status
>>> INFO:cephadm:Inferring fsid x
>>> INFO:cephadm:Inferring config x
>>> INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
>>>   cluster:
>>>     id:     41bb9256-c3bf-11ea-85b9-9e07b0435492
>>>     health: HEALTH_OK
>>>
>>>   services:
>>>     mon: 5 daemons, quorum ceph01,ceph04,ceph02,ceph03,ceph05 (age 2w)
>>>     mgr: ceph03.ytkuyr(active, since 2w), standbys: ceph01.aqkgbl, ceph02.gcglcg, ceph04.smbdew, ceph05.yropto
>>>     osd: 168 osds: 168 up (since 2d), 168 in (since 2d); 5 remapped pgs
>>>
>>>   data:
>>>     pools:   3 pools, 1057 pgs
>>>     objects: 18.00M objects, 69 TiB
>>>     usage:   119 TiB used, 2.0 PiB / 2.1 PiB avail
>>>     pgs:     1056 active+clean
>>>              1    active+clean+scrubbing+deep
>>>
>>>   io:
>>>     client:   859 KiB/s rd, 212 MiB/s wr, 644 op/s rd, 391 op/s wr
>>>
>>> root@ceph01:~#
>>>
>>> ###################
>>>
>>> When I look at ceph pg dump, I don't see any marked as remapped:
>>>
>>> ###################
>>> root@ceph01:~# ceph pg dump |grep remapped
>>> INFO:cephadm:Inferring fsid x
>>> INFO:cephadm:Inferring config x
>>> INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
>>> dumped all
>>> root@ceph01:~#
>>> ###################
>>>
>>> Any idea what might be going on/how to recover? All OSDs are up. Health
>>> is 'OK'. This is Ceph 15.2.4 deployed using Cephadm in containers, on
>>> Podman 2.0.3.
>>>
>>
>_______________________________________________
>ceph-users mailing list -- ceph-users@xxxxxxx
>To unsubscribe send an email to ceph-users-leave@xxxxxxx