Aleksei-
This won't be a Ceph answer. Most virtualization platforms offer a type of disk called ephemeral: storage built from disks local to the hypervisor, possibly in RAID with parity, and usually not backed up. You may want to consider running your Cassandra instances on ephemeral storage, which would avoid duplicating data redundancy at both the application and storage levels for the Cassandra service. Then keep backups of your Cassandra database on the Ceph storage. There are benefits and drawbacks; the main benefit will probably be lower latency. You will need to evaluate the hypervisors you are running on, their disk layout, etc.
-Jamie
On Thu, Feb 8, 2018 at 9:36 AM, Aleksei Gutikov <aleksey.gutikov@xxxxxxxxxx> wrote:
Hi all.
We use RBDs to store application data.
If an application can replicate data itself (for example, Cassandra),
we want to benefit from that application-level replication for HA.
But we can't if all RBDs are in the same pool.
If all RBDs are in the same pool, then they all share one set of PGs.
If even a single PG is damaged for any reason and, for example, gets stuck inactive, then all RBDs will be affected.
The first thing that comes to mind is to create a separate pool for every RBD.
I'm aware of the maximum number of PGs per OSD, and that osd_pool_default_pg_num
should be reasonable.
So max number of pools == osds_num * pgs_per_osd / min_pool_pgs.
For example: 1000 OSDs * 300 PGs per OSD / 32 PGs per pool = 9375 pools.
If each OSD is 1 TB, the average RBD size will be about 100 GB (looks sane).
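
The arithmetic above can be sketched as a small calculation. This is only an illustration of the formula in the message: the function names and the replica_count parameter are hypothetical, and replica_count is an added assumption (the per-OSD PG count normally includes every replica of a PG, so a replicated pool uses up the budget faster). With replica_count=1 it reproduces the numbers quoted above.

```python
def max_pools(num_osds, pgs_per_osd, min_pgs_per_pool, replica_count=1):
    """Upper bound on the number of pools, per the formula in the message.

    replica_count is an assumption added for illustration: each PG of a
    replicated pool occupies one PG slot on replica_count different OSDs.
    """
    total_pg_slots = num_osds * pgs_per_osd
    return total_pg_slots // (min_pgs_per_pool * replica_count)


def avg_rbd_size_gb(num_osds, osd_size_gb, pools, replica_count=1):
    """Average usable capacity per pool (one RBD per pool), in GB."""
    usable_gb = num_osds * osd_size_gb / replica_count
    return usable_gb / pools


pools = max_pools(1000, 300, 32)
print(pools)                                # 9375
print(avg_rbd_size_gb(1000, 1000, pools))   # roughly 107 GB per RBD
print(max_pools(1000, 300, 32, replica_count=3))  # 3125
```

Under the 3-replica assumption the pool budget drops to a third, while the average size per RBD stays the same because usable capacity shrinks by the same factor.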
So my question is: is there any theoretical limit on the number of pools per cluster?
And, if there is, what does it depend on?
Thanks.
--
Best regards,
Aleksei Gutikov
Software Engineer | synesis.ru | Minsk. BY
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com