Hi
We currently have a 3 DC setup with 6 HDD servers in each DC, running
replica 3 and EC 4+5 pools. This works fine for mostly dormant data, but
of course not so well for anything requiring low latency. It's mostly
big RBDs shared via kernel NFS, with a bit of CephFS.
We want to start migrating some of these big RBD-backed NFS shares to
NVMe. We currently have 3 E3.S servers, one in each DC, with an EC 4+5
pool using DC->OSD crush selection. We are wondering about the cheapest
way to increase redundancy in the short term, since we do not have
enterprise money.
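
For context, the crush rule is roughly this (a sketch; rule name and id
are illustrative):

  rule nvme_ec45 {
          id 5
          type erasure
          # pick 3 DCs, then 3 NVMe OSDs in each, for k+m = 9 shards
          step take default class nvme
          step choose indep 3 type datacenter
          step choose indep 3 type osd
          step emit
  }
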
With what we have we can lose 1 DC (aka host, in this case) plus 1 OSD
and still be online, but there's nowhere to backfill, and a single
hardware problem, like a faulty mainboard or CPU, can take down an
entire "DC".
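
To spell out the arithmetic:

  9 shards total (k=4, m=5), 3 per DC
  1 DC down  -> 3 shards lost
  + 1 OSD    -> 4 shards lost, 5 left
  5 >= min_size (k+1 = 5 by default), so PGs stay active, but degraded
  and with no spare host to backfill to.
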
Ideally we would have several more hosts in each DC so we could do
DC->Host->OSD, but that's not an option.
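The rule for that would
presumably just swap the last step, something like (sketch):

  step choose indep 3 type datacenter
  step chooseleaf indep 3 type host

so a single host failure would only cost each PG 1 shard instead of 3.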
I am thinking it could make sense to add just 1 more server in a 4th DC
and keep the 4+5 rule as-is. Each PG would still span only 3 of the 4
DCs, but CRUSH would have somewhere to backfill, giving us a buffer of 1
more failed host before we hit problems. Thoughts?
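
I assume we could sanity check the placement with the 4th host in place
with something like this (rule id and filename illustrative):

  ceph osd getcrushmap -o crushmap.bin
  crushtool -i crushmap.bin --test --rule 5 --num-rep 9 --show-mappings

to confirm each PG still maps its 9 shards across exactly 3 of the 4 DCs.
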
Best regards,
Torkil
--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel: +45 386 22828
E-mail: torkil@xxxxxxxx