Re: radosgw stopped working


 



I see; correct, plankton will only consider active+clean pgs.
It looks like the pgs in your output are remapped+backfilling, but not
degraded or undersized. In that case you can try cancelling the upmaps
(and the resulting backfills) with './upmap-remapped.py | sh' and then
retry the plankton movement. It usually takes a few runs of the script
to clean up all the backfills.

Upmap-remapped can be found here:
https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py
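
For reference, that loop looks roughly like this (a sketch; it assumes an
admin keyring on the host and that you review the generated commands
before piping them to sh):

./upmap-remapped.py          # prints 'ceph osd pg-upmap-items ...' commands - review first
./upmap-remapped.py | sh     # apply them, mapping pgs back to their current acting sets
ceph pg ls remapped          # repeat the two steps above until no remapped pgs are left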


Best,
Laimis J.

On Sun, Dec 22, 2024, 20:00 Rok Jaklič <rjaklic@xxxxxxxxx> wrote:

> Got the same output for osd.122 and 122:
> [root@ctplmon1 plankton-swarm]# bash ./plankton-swarm.sh source-osds 122 3
> Using custom source OSDs: 122
> Underused OSDs (<65%):
> 63,108,143,144,142,140,146,200,141,148,199,195,198,194,196,103,197,158,147,19,125,191,164,25,126,54,145,46,68,193,192,157,134,131,6,15,128,3,60,33,53,129,87,0,102,43,78,127,160,81,119,178,120,79,5,190,72,132,74,156,114,82,70,8,58,24,18,29,49,130,76,96,116,71,48,26,84,50,170,175,165,28,186,184,44,39,110,182,4,98,115,88,75,86,159,187,22,35,83,40,171,64,95,152,111,150,94,41,17,104,52,14,30,27,173,2,37,9,105,101,124,62,32,45,149,172,176,1,168,177,118,106,100,59
> Will now find ways to move 3 pgs in each OSD respecting node failure
> domain.
> Processing OSD 122...
> dumped all
> No active and clean pgs found for 122, skipping.
> Balance pgs commands written to swarm-file - review and let planktons
> swarm with 'bash swarm-file'.
>
> ---
>
> Got
>
> [root@ctplmon1 plankton-swarm]# ceph pg dump | grep ",122]"
> 9.3c      501445                   0         0     271415        0
>  625758213726            0           0  10071         0     10071
>            active+remapped+backfilling  2024-12-22T18:34:45.510544+0100
>  319600'7529350    319600:22988194    [132,7,155,95,181]         132
>  [84,171,11,95,122]              84    316281'7471014
>  2024-12-18T07:08:00.062519+0100    315831'7421752
>  2024-12-15T09:24:50.719535+0100              0                 3835
>  queued for deep scrub
>  501028                0
> 9.16      500984                   0         0     431251        0
>  626842925755            0           0   9152         0      9152
>            active+remapped+backfilling  2024-12-22T13:15:21.725290+0100
>  319600'7542047    319600:26275250     [161,2,90,76,187]         161
> [161,2,90,76,122]             161    316161'7465208
>  2024-12-17T11:35:38.075158+0100    316161'7465208
>  2024-12-17T11:35:38.075158+0100              0                26367
>  queued for scrub
> 503474                0
> 9.0       501542                   0         0     410101        0
>  627802839022            0           0  10076         0     10076
>            active+remapped+backfilling  2024-12-22T14:28:09.575782+0100
>  319600'7510533    319600:25805615    [150,109,38,67,21]         150
> [150,109,38,67,122]             150    316283'7453888
>  2024-12-18T07:33:10.588541+0100    316283'7453888
>  2024-12-18T07:33:10.588541+0100              0                21089
>  queued for scrub
> 501183                0
> 9.10      501310                   0         0      81473        0
>  626139264394            0           0  10071         0     10071
>            active+remapped+backfilling  2024-12-22T13:12:55.859829+0100
>  319600'7533517    319600:29592030   [121,183,59,160,93]         121
> [157,40,71,177,122]             157    316247'7472839
>  2024-12-18T01:34:50.713697+0100    314908'7370591
>  2024-12-11T21:07:36.550701+0100              0                  606
>  queued for deep scrub
>  501175                0
> 9.14c     502664                   0         0      63089        0
>  631327664452            0           0  10012         0     10012
>            active+remapped+backfilling  2024-12-21T23:10:58.757342+0100
>  319600'7529494    319600:29438730   [174,51,192,12,122]         174
> [174,51,183,12,122]             174    316298'7471325
>  2024-12-18T10:38:34.981914+0100    316298'7471325
>  2024-12-18T10:38:34.981914+0100              0                12758
>  queued for scrub
> 502128                0
> 9.169     500251                   0         0     454708        0
>  624120095931            0           0  10081         0     10081
>            active+remapped+backfilling  2024-12-22T15:18:50.038950+0100
>  319600'7569821    319600:25614989    [99,150,69,10,187]          99
>  [99,150,69,10,122]              99    316069'7480721
>  2024-12-16T21:34:22.271783+0100    316069'7480721
>  2024-12-16T21:34:22.271783+0100              0                59482
>  queued for scrub
> 501140                0
>
> It's backfilling 122, which is why there is no output from grep -P
> 'active\+clean(?!\+)' afterwards.
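>
> For comparison, a rough way to count the pg states on osd.122 (a sketch,
> not necessarily the exact check the script does):
>
> ceph pg ls-by-osd 122 | grep -cP 'active\+clean(?!\+)'   # strictly active+clean
> ceph pg ls-by-osd 122 | grep -c backfilling              # still backfilling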
>
> Rok
>
>
> On Sun, Dec 22, 2024 at 6:45 PM Laimis Juzeliūnas <
> laimis.juzeliunas@xxxxxxxxxx> wrote:
>
>> Hi Rok,
>>
>> Try running (122 instead of osd.122):
>> ./plankton-swarm.sh source-osds 122 3
>> bash swarm-file
>>
>> Will have to work on the naming conventions, apologies.
>> The pgremapper tool will also be able to help in this case.
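>>
>> For reference, pgremapper's cancel-backfill mode does roughly the same
>> thing (it uses upmaps to map pgs back onto their current acting sets);
>> as a sketch - double-check the exact flags with 'pgremapper --help':
>>
>> pgremapper cancel-backfill --yes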
>>
>>
>> Best,
>> Laimis J.
>>
>> On 22 Dec 2024, at 17:07, Rok Jaklič <rjaklic@xxxxxxxxx> wrote:
>>
>> No active and clean pgs found for osd.122, skipping.
>>
>>
>>



