Re: Nautilus: Decommission an OSD Node

Hi Dave,
It's been a few days and I haven't seen any follow-up on the list, so
I'm wondering if the issue is that there was a typo in your OSD list.
It looks like 16 is included again in the destination list where you
presumably meant 26:
"24,25,16,27,28"
I'm not familiar with the pgremapper script, so I may be
misunderstanding your command.
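
If it helps, building that kind of target list programmatically instead
of typing it out tends to avoid exactly this sort of slip. A rough
sketch with standard coreutils (the ranges below are placeholders;
adjust them to the OSDs you actually want as targets):

# emit a comma-separated --target-osds list from numeric ranges
{ seq 0 7; seq 24 71; } | paste -sd, -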

Rich

On Thu, 2 Nov 2023 at 09:39, Dave Hall <kdhall@xxxxxxxxxxxxxx> wrote:
>
> Hello,
>
> I've recently made the decision to gradually decommission my Nautilus
> cluster and migrate the hardware to a new Pacific or Quincy cluster. By
> gradually, I mean that as I expand the new cluster I will move (copy/erase)
> content from the old cluster to the new, making room to decommission more
> nodes and move them over.
>
> In order to do this I will, of course, need to remove OSD nodes by first
> emptying the OSDs on each node.
>
> I noticed that pgremapper (a version prior to October 2021) has a 'drain'
> subcommand that lets one control which target OSDs receive the PGs from
> the source OSD being drained.  This seemed like a good idea: if one simply
> marks an OSD 'out', its contents would be rebalanced to other still-active
> OSDs on the same node, which seems like it would cause a lot of unnecessary
> data movement and also make removing the next OSD take longer.
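>
> (To make the contrast concrete, by 'simply marks an OSD out' I mean the
> plain approach below, where 16 is one of the OSDs on the node being
> emptied; just a sketch, not what I actually ran:)
>
> # mark the OSD out; its PGs get remapped wherever CRUSH decides,
> # which includes the remaining OSDs that are still up on this host
> ceph osd out 16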
>
> So I went to the trouble of creating a 'really long' pgremapper drain
> command that excludes the OSDs of two nodes as targets:
>
> # bin/pgremapper drain 16 --target-osds
> 00,01,02,03,04,05,06,07,24,25,16,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71
> --allow-movement-across host  --max-source-backfills 75 --concurrency 20
> --verbose --yes
>
>
> However, when this completed, OSD 16 actually contained more PGs than
> before I started.  It appears that the mapping generated by pgremapper
> also backfilled the OSD while it was draining it.
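>
> (If anyone wants to double-check the comparison, I'm going by per-OSD
> PG counts; the PGS column of 'ceph osd df' is one way to see them, e.g.:)
>
> # print the 'ceph osd df' row for OSD 16 only (first column is the OSD
> # ID); the PGS column near the end is the current PG count on that OSD
> ceph osd df | awk '$1 == 16'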
>
> So did I miss something here?  What is the best way to proceed?  I
> understand that it would be mayhem to mark 8 of 72 OSDs out and then turn
> backfill/rebalance/recover back on.  But it seems like there should be a
> better way.
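>
> To spell that out: the 'mayhem' approach I can picture would be to pause
> data movement, mark the node's OSDs out, and then let backfill loose all
> at once, roughly like this (the OSD IDs are placeholders for the eight
> on the node being removed):
>
> ceph osd set norebalance
> ceph osd set nobackfill
> for id in 16 17 18 19 20 21 22 23; do ceph osd out $id; done
> ceph osd unset nobackfill
> ceph osd unset norebalance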
>
> Suggestions?
>
> Thanks.
>
> -Dave
>
> --
> Dave Hall
> Binghamton University
> kdhall@xxxxxxxxxxxxxx