> Hi,
> I've seen Dan's talk:
> https://www.youtube.com/watch?v=0i7ew3XXb7Q
> and other similar ones that talk about CLUSTER size.
> But I see nothing (perhaps I have not looked hard enough) on any
> recommendations regarding max POOL size.
> So, are there any limitations on a given pool that has all OSDs of the
> same type?
> I know that this is vague, and may depend on device type, crush rule,
> ec vs replicated, network bandwidth, etc. But if there are any
> limitations or experiences that have exposed limits you don't want to
> go over, it would be nice to know.
> Also, an anecdotal 'our biggest pool is X, and we don't have problems',
> or 'pools over Y started to show problem Z', would be great too.

We got into trouble with a 12-node cluster holding 660M objects of
average size <180k on spin-only disks, which had trouble keeping all
PGs/objects in sync. Having SSD or NVMe for WAL/DB might have worked
out fine, as would perhaps having lots more hosts.

In older times, we built a large cluster out of 250+ SMR drives on
cheap Atom CPUs; that one crashed and burned. SMR drives are fine for
certain kinds of usage, but Ceph repair and backfill is not one of
them. While the cluster worked (and performed rather well for its
price) when everything was healthy, each failed drive or other outage
would have tons of OSDs flap, time out and die during recovery, because
SMR is simply poor at non-linear workloads. Just don't buy SMR unless
you treat the drives like tapes, writing large sequential IOs against
them linearly.

--
May the most significant bit of your life be positive.
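
P.S. For a feel of the scale in the first example, here is a rough
back-of-the-envelope sketch in Python. Only the 660M objects, ~180k
average size and 12 nodes come from the post; the per-node OSD count
and the replication factor are my assumptions.

    # Rough scale of the 12-node, 660M-object cluster described above.
    # ASSUMED: 12 OSDs per host and 3x replication -- neither is stated
    # in the post; adjust to taste.

    objects_total   = 660_000_000   # from the post
    avg_object_size = 180 * 1024    # "average size <180k", read as 180 KiB
    nodes           = 12            # from the post
    osds_per_node   = 12            # assumption
    replicas        = 3             # assumption

    osds           = nodes * osds_per_node
    user_data_tib  = objects_total * avg_object_size / 2**40
    copies_per_osd = objects_total * replicas / osds

    print(f"user data:         ~{user_data_tib:.0f} TiB")    # ~111 TiB
    print(f"object copies/OSD: ~{copies_per_osd/1e6:.1f}M")  # ~13.8M

On the order of 14 million object copies per OSD means scrub and
backfill become mostly small-object metadata work, which is the access
pattern that hurts on spinners with colocated WAL/DB; putting the
DB/WAL on flash (e.g. ceph-volume lvm create with --block.db pointing
at an NVMe partition) is the usual mitigation, in line with the
"might have worked out fine" remark above.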