Re: SSD Sizing for DB/WAL: 4% for large drives?

Frank Yu <flyxiaoyu@xxxxxxxxx> · Wed, 29 May 2019 11:23:57 +0800

Hi Jake, 

I have the same question about DB/WAL sizing for OSDs. My setup: 12 OSDs per node, 8 TB per OSD (maybe 12 TB later), and 375 GB of Intel NVMe SSD (Optane P4800X) per node, which means each 8 TB OSD can use about 30 GB for DB/WAL. I mainly use CephFS to serve an HPC cluster for ML.
(I plan to move the CephFS metadata to a pool based on NVMe SSDs. BTW, does this improve performance a lot? Are there any comparisons?)
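
For reference, the per-OSD share from the numbers above can be worked out with a few lines of Python. This is only a sketch of the arithmetic: the function name is made up for illustration, and it assumes decimal units (1 TB = 1000 GB) and an even split of the SSD across all OSDs on the node.

```python
def db_wal_share(ssd_gb, osds_per_ssd, hdd_tb):
    """Return (DB/WAL GB per OSD, percent of the data device).

    Assumes the SSD is split evenly and 1 TB = 1000 GB.
    """
    per_osd_gb = ssd_gb / osds_per_ssd
    pct = 100.0 * per_osd_gb / (hdd_tb * 1000.0)
    return per_osd_gb, pct


# 375 GB of NVMe shared by 12 OSDs, for 8 TB and 12 TB data drives
for hdd_tb in (8, 12):
    gb, pct = db_wal_share(375, 12, hdd_tb)
    print(f"{hdd_tb} TB OSD: {gb:.2f} GB DB/WAL = {pct:.2f}% of the data device")
```

So the layout gives roughly 31 GB per OSD, which is well under 1% of an 8 TB drive and far below the 4% figure discussed in this thread.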

On Wed, May 29, 2019 at 12:29 AM Jake Grimmett <jog@xxxxxxxxxxxxxxxxx> wrote:
Hi Martin,

thanks for your reply :)

We already have a separate NVMe SSD pool for cephfs metadata.

I agree it's much simpler & more robust not using a separate DB/WAL, but
as we have enough money for a 1.6TB SSD for every 6 HDD, it's tempting
to go down that route. If people think a 2.2% DB/WAL is a bad idea, we
will definitely have a re-think.
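
The 2.2% figure follows directly from the hardware described in this thread. A rough Python check, assuming decimal units (1 TB = 1000 GB) and an even split of the SSD across its HDDs; the variable names are illustrative only:

```python
SSD_GB = 1600          # one 1.6TB SSD ...
HDDS_PER_SSD = 6       # ... shared by 6 HDD-backed OSDs
HDD_GB = 12000         # 12TB data drive
TARGET_RATIO = 0.04    # the 4% rule of thumb mentioned in this thread

available_gb = SSD_GB / HDDS_PER_SSD    # DB/WAL space each OSD would get
target_gb = TARGET_RATIO * HDD_GB       # what 4% of a 12TB drive would need
actual_ratio = available_gb / HDD_GB

print(f"available per OSD: {available_gb:.0f} GB ({100 * actual_ratio:.1f}%)")
print(f"4% target per OSD: {target_gb:.0f} GB")
print(f"shortfall per OSD: {target_gb - available_gb:.0f} GB")
```

That is about 267 GB available against a 480 GB target per OSD.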

Perhaps I'm being greedy for more performance; we have a 250-node HPC
cluster, and it would be nice to see how cephfs compares to our beegfs
scratch.

best regards,

Jake

On 5/28/19 3:14 PM, Martin Verges wrote:

> Hello Jake,
>
> do you have any latency requirements that make a separate DB/WAL
> necessary at all? If not, CephFS with EC on SATA HDD works quite well
> as long as you have the metadata on a separate SSD pool.

> 

> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.verges@xxxxxxxx
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx

> 

> 

> On Tue, 28 May 2019 at 15:13, Jake Grimmett <jog@xxxxxxxxxxxxxxxxx> wrote:

> 

>     Dear All,
>
>     Quick question regarding SSD sizing for a DB/WAL...
>
>     I understand 4% is generally recommended for a DB/WAL.
>
>     Does this 4% continue for "large" 12TB drives, or can we economise and
>     use a smaller DB/WAL?
>
>     Ideally I'd fit a smaller drive providing a 266GB DB/WAL per 12TB OSD,
>     rather than 480GB, i.e. 2.2% rather than 4%.
>
>     Will "bad things" happen as the OSD fills with a smaller DB/WAL?
>
>     By the way the cluster will mainly be providing CephFS, fairly large
>     files, and will use erasure coding.
>
>     many thanks for any advice,
>
>     Jake

> 

> 

>     _______________________________________________

>     ceph-users mailing list

>     ceph-users@xxxxxxxxxxxxxx <mailto:ceph-users@xxxxxxxxxxxxxx>

>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

> 

-- 
Regards
Frank Yu
