Re: Experience reducing size 3 to 2 on production cluster?

Hi Martin,

Agreed on the min_size of 2.  I have no intention of worrying about uptime
in the event of a host failure.  Once size 2 takes effect (and I'm unsure
how long that will take), we intend to evacuate all OSDs on one of the 4
hosts so we can migrate that host to the new cluster, where its OSDs will
then be added in.  Once they are added and the cluster has rebalanced, we
will complete the copies (<3 days) and then migrate one more host, allowing
us to bring size back up to 3.  Once balanced again, we will collapse the
last 2 nodes into the new cluster.  I am hoping that, inclusive of
rebalancing, the whole project will take only 3 weeks, but time will tell.
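
For what it's worth, the size change itself is just two pool settings.
Roughly what I have queued up (the pool name below is a placeholder for our
actual CephFS data pool, and the small Python wrapper is only there to make
the steps explicit; the underlying ceph CLI calls are the point):

    import subprocess

    POOL = "cephfs_data"  # placeholder; substitute the real data pool name

    def ceph(*args):
        # thin wrapper around the ceph CLI; returns stdout as text
        return subprocess.check_output(("ceph",) + args).decode()

    # drop replication on the data pool; keep min_size at 2, as Martin suggests
    ceph("osd", "pool", "set", POOL, "size", "2")
    ceph("osd", "pool", "set", POOL, "min_size", "2")

    # confirm the setting took
    print(ceph("osd", "pool", "get", POOL, "size"))

After that, evacuating a host should just be a matter of marking its OSDs
out (ceph osd out <id>) and waiting for backfill to finish before pulling
the node.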

Has anyone asked Ceph to reduce hundreds of millions, if not billions, of
files from size 3 to size 2, and if so, were you successful?  I know it
*should* be able to do this, but sometimes theory and practice don't
perfectly overlap.
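
To keep an eye on how the reduction and subsequent rebalancing progress,
I'm planning to poll the cluster status for degraded/misplaced counts.  A
rough sketch of what I have in mind (the field names are what I expect from
'ceph -s --format json' on Nautilus; they only appear while recovery is in
flight, hence the defaults):

    import json
    import subprocess
    import time

    def pg_counts():
        # 'ceph -s' in JSON form; the pgmap section carries recovery counters
        status = json.loads(
            subprocess.check_output(["ceph", "-s", "--format", "json"]).decode())
        pgmap = status["pgmap"]
        # the counters are omitted from the output once they reach zero
        return (pgmap.get("misplaced_objects", 0),
                pgmap.get("degraded_objects", 0))

    while True:
        misplaced, degraded = pg_counts()
        print("misplaced=%d degraded=%d" % (misplaced, degraded))
        if misplaced == 0 and degraded == 0:
            break
        time.sleep(60)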

Thanks,
Marco

On Sat, Dec 11, 2021 at 4:37 AM Martin Verges <martin.verges@xxxxxxxx>
wrote:

> Hello,
>
> Avoid size 2 whenever you can. As long as you know that you might lose
> data, it can be an acceptable risk while migrating the cluster. We have
> done that multiple times in the past, and it is a valid use case in our
> opinion. However, make sure to monitor the state and recover as fast as
> possible. Leave min_size at 2 as well and accept the potential downtime!
>
> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
>
>
> On Fri, 10 Dec 2021 at 18:05, Marco Pizzolo <marcopizzolo@xxxxxxxxx>
> wrote:
>
>> Hello,
>>
>> As part of a migration process where we will be swinging Ceph hosts from
>> one cluster to another, we need to reduce size from 3 to 2 in order to
>> shrink the footprint sufficiently to allow safe removal of an OSD/Mon
>> node.
>>
>> The cluster has about 500M objects as per the dashboard, and is about
>> 1.5PB in size, consisting solely of small files served through CephFS to
>> Samba.
>>
>> Has anyone encountered a similar situation?  What (if any) problems did
>> you face?
>>
>> Ceph 14.2.22, bare-metal deployment on CentOS.
>>
>> Thanks in advance.
>>
>> Marco
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


