Re: DB sizing for lots of large files

Hi,

On 11/26/20 12:45 PM, Richard Thornton wrote:
Hi,

Sorry to bother you all.

It’s a home server setup.

Three nodes (ODROID-H2+ with 32GB RAM and dual 2.5Gbit NICs), two 14TB
7200rpm SATA drives and an Optane 118GB NVMe in each node (OS boots from
eMMC).


*snipsnap*

Is there a rough CephFS calculation (each file uses x bytes of metadata)? I
think I should be safe with 30GB, but now I read I should double that (you
should allocate twice the size of the biggest layer to allow for
compaction); however, I only have 118GB and two OSDs, so I will have to go
for 59GB (or whatever will fit)?

The recommended size of 30 GB is due to the level design of RocksDB: data is stored in levels of increasing size, and 30 GB is a sweet spot between 3 GB and 300 GB (too small / way too large for most use cases). The recommendation to double the size to leave room for compaction is OK, but you will waste that extra capacity most of the time.
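
As a rough back-of-envelope sketch (assuming Ceph's usual RocksDB tuning of max_bytes_for_level_base = 256 MB and a level multiplier of 10, which is where figures like 3 / 30 / 300 GB come from), you can check which levels still fit completely on a DB partition of a given size:

# Back-of-envelope sketch of RocksDB level sizes, assuming the usual
# tuning of max_bytes_for_level_base = 256 MB and a level multiplier of 10.
# Adjust the constants if your rocksdb options differ.

BASE_MB = 256          # max_bytes_for_level_base
MULTIPLIER = 10        # max_bytes_for_level_multiplier
WAL_GB = 2             # rough allowance for the WAL (assumption)

def fitting_levels(db_size_gb, wal_gb=WAL_GB):
    """Cumulative DB usage (GB) after each level that still fits entirely."""
    budget_mb = (db_size_gb - wal_gb) * 1024
    level_mb, total_mb, fits = BASE_MB, 0, []
    while total_mb + level_mb <= budget_mb:
        total_mb += level_mb
        fits.append(total_mb / 1024)
        level_mb *= MULTIPLIER
    return fits

# Levels are ~0.25, 2.5, 25, 250 GB; a 30 GB partition fits L1..L3
# (~28 GB cumulative), and the next useful step up is ~300 GB, so anything
# in between mostly buys compaction headroom rather than another level.
for size_gb in (10, 30, 60, 300):
    print(size_gb, "GB ->", [round(g, 2) for g in fitting_levels(size_gb)])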

In our CephFS instance we have ~115,000,000 files. Metadata is stored on 18 SSD-based OSDs. About 30-35 GB of raw capacity is currently in use, almost exclusively for metadata, omap and other internal data. You might be able to scale this down to your use case. Our average file size is approx. 5 MB, so you may want to put a little bit on top in your case.
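
For a very rough per-file estimate derived from these numbers (illustrative only; actual usage depends on directory sizes, snapshots and replication, and the home file count below is just a made-up example):

# Rough per-file metadata estimate from the figures above (illustrative only).
files = 115_000_000
raw_metadata_gb = 35                       # upper end of the 30-35 GB above

bytes_per_file = raw_metadata_gb * 1024**3 / files
print(f"~{bytes_per_file:.0f} bytes of raw metadata per file")   # ~330 bytes

home_files = 5_000_000                     # hypothetical home-server count
print(f"~{home_files * bytes_per_file / 1024**3:.1f} GiB for {home_files:,} files")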

If your working set (the files accessed within a given time span) is rather small, you also have the option to use the SSD as a block device caching layer such as bcache or dm-cache. In this setup the whole SSD capacity will be used, and data operations on the OSDs will also benefit from the faster SSD. Your failure domain stays the same: if the SSD dies, the data disks behind it will be useless.
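
If you want to explore that route, a minimal sketch of how such a bcache device could be assembled with bcache-tools is below (device paths are hypothetical, and none of this is specific to Ceph; the OSD would simply be created on /dev/bcache0 afterwards):

# Sketch only: assembling a bcache device with bcache-tools, driven from
# Python purely for illustration. Device paths are hypothetical; run as
# root and double-check every path before doing anything like this.
import subprocess

BACKING = "/dev/sda"          # hypothetical 14 TB data disk
CACHE = "/dev/nvme0n1p2"      # hypothetical partition on the Optane

def run(*cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Format backing and cache device and bind them in one step.
run("make-bcache", "-B", BACKING, "-C", CACHE)

# Optional: switch the resulting /dev/bcache0 to writeback caching.
with open("/sys/block/bcache0/bcache/cache_mode", "w") as f:
    f.write("writeback")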

Otherwise I would recommend using DB partitions of the recommended size (do not forget to include some extra space for the WAL) and using the remaining capacity for extra SSD-based OSDs, similar to our setup. This will ensure that metadata access is fast[tm].
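
For the 118 GB Optane that split could look roughly like this (illustrative numbers, assuming ~30 GB DB plus a small WAL allowance per HDD OSD):

# Illustrative split of a 118 GB NVMe between two DB/WAL partitions and a
# small SSD-based OSD; adjust for partition alignment and your WAL size.
NVME_GB = 118
HDD_OSDS = 2
DB_GB = 30        # recommended DB size per HDD OSD
WAL_GB = 2        # extra allowance for the WAL (assumption)

reserved = HDD_OSDS * (DB_GB + WAL_GB)     # 64 GB
leftover = NVME_GB - reserved              # 54 GB

print(f"{reserved} GB for {HDD_OSDS} x (DB + WAL), "
      f"{leftover} GB left for a small SSD OSD (e.g. the CephFS metadata pool)")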


Regards,

Burkhard

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



