Hi,

> So you need to think about failure domains. Your failure domain will be set to host.
> If you put all the DBs on one SSD and all the WALs on another SSD, then a failure of either of those SSDs will result in a failure of all the OSDs behind them. So in this case all 10 OSDs would have failed.
>
> Splitting it so that each SSD holds the RocksDB and WAL of 5 OSDs means a failure of an SSD only impacts 5 OSDs.
>
> A failure of an SSD will take down all the OSDs that are behind that SSD.

That is what I suspected, thanks for confirming it.

> That's one of the reasons I would always say you need one node's worth of spare capacity in the cluster, to allow automated rebuilds to happen.
>
> As for your EC 7+5, I would have gone for something like 8+3, as then you have a spare node active in the cluster and can still provide full protection in the event of a node failure.

Makes sense! On another cluster, I have an EC 7+5 pool for CephFS, but there are 4 servers per chassis: if I lose one chassis, I still need to be able to access the data. But for this cluster, you are right, 8+3 may be enough redundancy.

> Think about software updates that require a reboot of a node. Any data written during that time will need recovering to bring it back to full protection, whereas if you have a spare node, that data can be written without requiring a later recovery.

It is mostly a read-only cluster used to distribute public datasets over S3 inside our network, so it is fine for me if write operations are not fully protected for a couple of days. All write operations are managed by us to update the datasets. But as mentioned above, 8+3 may be a good compromise.

For reference, I put a rough sketch of the LVM layout I have in mind for the 5+5 split below the quoted message.

Best,
Yoann

> On 03/09/2019, 10:29, "Yoann Moulin" <yoann.moulin@xxxxxxx> wrote:
>
> Hello,
>
> I am deploying a new Nautilus cluster and I would like to know what would be the best OSD scenario config in this case:
>
> 10x 6TB disk OSDs (data)
> 2x 480GB SSDs, previously used for journals, that can be used for WAL and/or DB
>
> Is it better to put all WALs on one SSD and all DBs on the other one? Or put the WAL and DB of the first 5 OSDs on the first SSD and those of the other 5 on the second one?
>
> A more general question: what is the impact on an OSD if we lose the WAL? The DB? Both?
>
> I plan to use EC 7+5 on 12 servers and I am OK if I lose one server temporarily. I have spare servers and I can easily add another one to this cluster.
>
> To deploy this cluster, I use ceph-ansible (stable-4.0). I am not sure how to configure the playbook to use SSDs and disks with LVM.
>
> https://github.com/ceph/ceph-ansible/blob/master/docs/source/osds/scenarios.rst
>
> Is this correct?
>
> osd_objectstore: bluestore
> lvm_volumes:
>   - data: data-lv1
>     data_vg: data-vg1
>     db: db-lv1
>     db_vg: db-vg1
>     wal: wal-lv1
>     wal_vg: wal-vg1
>   - data: data-lv2
>     data_vg: data-vg2
>     db: db-lv2
>     db_vg: db-vg2
>     wal: wal-lv2
>     wal_vg: wal-vg2
>
> Is it possible to let the playbook configure LVM for each disk in a mixed case? It looks like I must configure LVM before running the playbook, but I am not sure if I missed something.
>
> Can wal_vg and db_vg be identical (one VG per SSD shared by multiple OSDs)?
>
> Thanks for your help.
>
> Best regards,
>
> --
> Yoann Moulin
> EPFL IC-IT
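Here is the sketch mentioned above. It is only a rough plan: it assumes one VG per data disk, one shared VG per SSD holding the DB and WAL LVs of 5 OSDs, and that ceph-ansible accepts pointing several OSDs' db_vg/wal_vg at the same VG. All VG/LV names (data-vgN, ssd-vg1, ssd-vg2, ...) are placeholders, and I have not validated this against ceph-ansible yet.

osd_objectstore: bluestore
lvm_volumes:
  # OSDs 1-5: data LV on its own HDD VG, DB and WAL LVs carved out of the
  # first SSD's shared VG (ssd-vg1)
  - data: data-lv1
    data_vg: data-vg1
    db: db-lv1
    db_vg: ssd-vg1
    wal: wal-lv1
    wal_vg: ssd-vg1
  - data: data-lv2
    data_vg: data-vg2
    db: db-lv2
    db_vg: ssd-vg1
    wal: wal-lv2
    wal_vg: ssd-vg1
  # ... entries 3-5 follow the same pattern ...
  # OSDs 6-10: same layout, but db_vg/wal_vg point to ssd-vg2 on the second SSD
  - data: data-lv6
    data_vg: data-vg6
    db: db-lv6
    db_vg: ssd-vg2
    wal: wal-lv6
    wal_vg: ssd-vg2
  # ... entries 7-10 follow the same pattern ...

If that is wrong, or if the VGs/LVs really have to be created by hand before running the playbook, any correction is welcome.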
--
Yoann Moulin
EPFL IC-IT
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx