Re: Can I create 8+2 Erasure coding pool on 5 node?

Once you have your additional 5 nodes you can adjust your CRUSH rule to
use failure domain = host, and Ceph will rebalance the data automatically
for you. This will involve quite a bit of data movement (at least 50% of
your data will need to be migrated), so it can take some time. Also, the
official recommendation is to run min_size = K+2 on an EC pool, which in
your case would make your cluster unavailable for any maintenance event
that takes out a whole node. With 14 TB drives, drive failure recovery
will take a significant amount of time (especially the backfill of the
new drive, which can easily take a couple of weeks).
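
For what it's worth, here is a rough sketch of what that switch could
look like once the second set of hosts is in place; the profile, rule
and pool names below are placeholders, not taken from your setup:

# new profile/rule with failure domain = host (names are examples only)
ceph osd erasure-code-profile set ec82-host k=8 m=2 \
    crush-failure-domain=host crush-device-class=hdd
ceph osd crush rule create-erasure ec82-host-rule ec82-host
# point the existing 8+2 pool at the new rule (k and m stay the same)
ceph osd pool set <your-ec-pool> crush_rule ec82-host-rule

Once the crush_rule is switched, Ceph starts remapping the PGs onto the
new layout by itself.
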
Keep in mind that recovery generally increases the pressure on cluster
resources significantly, so if you design your cluster for the happy-day
case only, you will suffer greatly when trouble hits (and this is a
question of "when", not "if"). I know plenty of people who would not run
a cluster without at least one node to spare in their failure domain
(i.e. for EC run at least K + M + 1 nodes, or for replication N run at
least N + 1 nodes). Generally, people's time and mental health are worth
more (or should be) than the comparatively few dollars you'd save on
hardware.
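
If you do end up with a long backfill running under client traffic, the
usual way to take some pressure off is to turn down the recovery and
backfill throttles for a while, for example (treat the values as
examples, the defaults differ between releases):

ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1

and raise them again once the cluster is back to HEALTH_OK.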

On Fri, 26 Mar 2021 at 08:17, by morphin <morphinwithyou@xxxxxxxxx> wrote:

> Thank you for the answers.
>
> But I don't have a problem with setting up 8+2. The problem is the expansion.
>
> I need to move the 5 nodes with data on them and add the other 5 nodes
> later because they're in a different city. The goal I'm trying to reach
> is 8+2 (host crush rule).
> So I want to cut the data into 10 pieces and put them on 5 nodes. After
> adding the other 5 nodes I want to move 5 pieces of the data to the new
> nodes and have 8+2 in the end.
>
> Also, the data is S3 and I shouldn't break RGW multisite, so that sync
> can continue later... If I can't continue, the data will be garbage
> anyway. Syncing the data over the internet would take a month or more.
> It's easier to take the 5 nodes in "B", bring them to "A", create the
> cluster, sync the data and move them back to B :)
> Because of that I need to create the 5 node cluster first with 8+2 EC,
> sync the data, move those 5 nodes to the B datacenter, add the other 5
> nodes later and rebalance all the data to reach 8+2 (host).
> But I really don't know if it will work. I'm used to replication; this
> is the first time I'm dealing with an EC setup.
>
> BTW: every node has 20x 14 TB SAS drives and 4x 900 GB SSDs for the RGW
> index. (The SSDs use replication 3.)
>
> On Thu, 25 Mar 2021 at 22:03, Dan van der Ster <dan@xxxxxxxxxxxxxx>
> wrote:
> >
> > Here's a crush ruleset for 8+2 that will choose 2 osds per host:
> >
> >
> > rule cephfs_data_82 {
> >         id 4
> >         type erasure
> >         min_size 3
> >         max_size 10
> >         step set_chooseleaf_tries 5
> >         step set_choose_tries 100
> >         step take default class hdd
> >         step choose indep 5 type host
> >         step choose indep 2 type osd
> >         step emit
> > }
> >
> >
> >
> > This is kind of useful because if you set min_size to 8, you could even
> > lose an entire host and stay online.
> >
> > Cheers, dan
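
(As a concrete command, with <your-ec-pool> standing in for the data
pool's name:

ceph osd pool set <your-ec-pool> min_size 8

just weigh that against the min_size = K+2 point above, since min_size =
K leaves no spare shard during recovery.)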
> >
> >
> >
> >
> >
> > On Thu, Mar 25, 2021, 7:02 PM by morphin <morphinwithyou@xxxxxxxxx>
> > wrote:
> >>
> >> Hello.
> >>
> >> I have a 5 node cluster in datacenter A. I also have the same 5 nodes
> >> in datacenter B.
> >> They're going to become a 10 node 8+2 EC cluster for backup, but I
> >> need to add the second 5 nodes later.
> >> I have to sync my S3 data with multisite onto the 5 node cluster in
> >> datacenter A, move those nodes to B and add the other 5 nodes to the
> >> same cluster.
> >>
> >> The question is: can I create an 8+2 EC pool on a 5 node cluster and
> >> add the other 5 nodes later? How can I rebalance the data after that?
> >> Or is there a better solution in my case? What should I do?
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



