Re: fault tolerant about erasure code pool

I think in your case I would just go with failure domain OSD until you have enough servers to change that.

If you want somewhat better uptime, with 3 servers you could consider EC 4+2 and use a "simple" technique to place at most 2 shards per physical host. We had the same issue: we needed availability during maintenance but couldn't afford the necessary server count.
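
In case it helps, a rough sketch of what that could look like on the command line (profile and pool names below are just examples, not from this thread):

  # EC 4+2 profile with host as the failure domain; combined with the extra
  # host buckets described below, this gives 6 "hosts" on 3 physical servers
  ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
  ceph osd pool create backup-ec 128 128 erasure ec42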

Physical host and logical host (that is, crush host bucket) are two independent things. It is possible to create additional host buckets and to define in ceph.conf which bucket an OSD shows up in, with entries like

[osd.0]
crush location = "host=ceph-21"

Host ceph-21 does not yet exist as a physical host in our cluster. You will also need to set min_size=k (=4) until you get additional hosts.
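
Something along these lines should do it (pool name again just an example):

  # allow I/O with only the k=4 data shards available;
  # the default for a 4+2 pool would be min_size = k+1 = 5
  ceph osd pool set backup-ec min_size 4

Note that running with min_size=k means a PG can go active with no spare shards, which is why this is only meant as a temporary measure.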

What I found most useful is, for each physical host, to use the physical host's own hostname for one bucket and the intended hostname of a future host for the other. Once you get the new host, physically move the OSDs belonging to that bucket to the new server, remove the ceph.conf entries and you are done. There will be no rebalancing either, because the crush map itself does not change.

This requires very careful bookkeeping, though. A further drawback is that you need the OSD ID, which means OSDs will first show up in the bucket of the physical host and must be moved manually to the bucket of the fake host. This is additional admin workload that requires careful attention to detail, since wrongly configured OSDs will move themselves to the wrong bucket on restart. You also need to develop a procedure for finding the physical disks associated with each host bucket.
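
To illustrate the kind of manual steps this involves (OSD ID, weight and bucket names below are made up for the example):

  # create the fake host bucket and hook it into the default root
  ceph osd crush add-bucket ceph-21 host
  ceph osd crush move ceph-21 root=default

  # move an OSD from the physical host's bucket into the fake one
  # (1.819 is a placeholder, use the OSD's actual crush weight)
  ceph osd crush create-or-move osd.0 1.819 root=default host=ceph-21

  # when you later need the physical disk behind an OSD
  ceph osd metadata 0 | grep -e hostname -e devices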

My recommendation would be to push hard for money for extra servers. The added workload and the increased chance of accidents easily cost as much in salary as the extra hardware would.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Zhenshi Zhou <deaderzzs@xxxxxxxxx>
Sent: 28 June 2020 06:58:56
To: ceph-users
Subject:  Re: fault tolerant about erasure code pool

I'm going to try the approach Janne suggested, but what confuses me is how
to expand the cluster.
Should I change the profile if I add hosts which don't have the same OSDs
as these 3 hosts?

Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote on Sunday, 28 June 2020 at 12:53 PM:

> I have only 3 hosts at present, and I tend to use EC pools because I don't
> have much budget.
> The cluster is used for cold backup and it doesn't need high QoS either.
>
> <DHilsbos@xxxxxxxxxxxxxx> wrote on Friday, 26 June 2020 at 11:40 PM:
>
>> As others have pointed out, setting the failure domain to OSD is
>> dangerous because then all 6 chunks for an object can end up on the same
>> host. 6 hosts really seems like the minimum to mess with EC pools.
>>
>> Adding a bucket type between host and osd seems like a good idea here, if
>> you absolutely must use EC pools.
>>
>> Perhaps something that corresponds to the HBAs / disk controllers?
>>
>> Thank you,
>>
>> Dominic L. Hilsbos, MBA
>> Director - Information Technology
>> Perform Air International, Inc.
>> DHilsbos@xxxxxxxxxxxxxx
>> www.PerformAir.com
>>
>>
>>
>> -----Original Message-----
>> From: Lindsay Mathieson [mailto:lindsay.mathieson@xxxxxxxxx]
>> Sent: Friday, June 26, 2020 4:08 AM
>> To: Zhenshi Zhou
>> Cc: ceph-users
>> Subject:  Re: fault tolerant about erasure code pool
>>
>> On 26/06/2020 8:08 pm, Zhenshi Zhou wrote:
>> > Hi Lindsay,
>> >
>> > I have only 3 hosts; is there any method to set up an EC pool cluster
>> > in a better way?
>>
>> There's failure domain by OSD, which Janne knows far better than I :)
>>
>> --
>> Lindsay
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>
>