Re: Mixed SSD+HDD OSD setup recommendation

* Bluestore. It's so much better than Filestore. The latest versions
add more control over memory usage with cache autotuning; check out
the latest Luminous release notes.
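
Rough sketch of what I mean, assuming a recent Luminous point release
(option names and defaults can differ between minor versions, so check
the release notes); the 6 GB figure is just an example, not a
recommendation:

    # ceph.conf on the OSD hosts: cap each ceph-osd at ~6 GB and let the
    # BlueStore cache autotuner work within that budget
    [osd]
    osd_memory_target = 6442450944     # bytes, per OSD daemon
    bluestore_cache_autotune = true

Keep in mind that ~30 OSDs at the default 4 GB target already add up to
~120 GB, so there is not much headroom left for VMs on a 128 GB host.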

* ~15 HDDs per SSD is usually too much. Note that you will lose all
of those HDDs if the SSD dies: an OSD without its block.db is useless
and must be re-created from scratch. Depending on the type of SSD, it
might also be overloaded.
A block.db that is on the small side is not a big deal; exact usage
depends on object size and omap usage. ~1% is fine if you have ~4 MB
objects and basically no omap.
If it is too small, cold data will simply spill over to the slow OSD disk.
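
To put rough numbers on your hardware (my math, not a hard rule):
2 x ~900 GB of SSD spread over ~30 HDDs gives you about 60 GB of
block.db per 12 TB OSD, i.e. roughly 0.5 %, which is in the right
ballpark for large objects with basically no omap. A minimal sketch of
creating an OSD with an external block.db (device and VG/LV names are
placeholders):

    # one ~60 GB DB LV per OSD on the SSD's VG, then hand both to ceph-volume
    lvcreate -L 60G -n db-sdc ssd0
    ceph-volume lvm create --bluestore --data /dev/sdc --block.db ssd0/db-sdc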

* Using LVM cache is possible; the same caveat about losing the disks
when the SSD dies applies.
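
For reference, a minimal lvmcache sketch (VG/LV and device names are
made up; see lvmcache(7) for the details on your LVM version):

    # cache pool on the SSD, attached to an HDD-backed LV
    lvcreate --type cache-pool -L 60G -n cpool0 vg0 /dev/sdb
    lvconvert --type cache --cachepool vg0/cpool0 vg0/osd-sdc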

* LVM. There are a few older discussions here on the mailing list
about whether this is considered "overhead".
It's a little bit more annoying to handle from an operations
perspective but still better than the old ceph-disk+udev issues during
startup.
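
Note that you don't have to create all those VGs/LVs by hand:
ceph-volume does it for you when you hand it raw devices. A hedged
sketch (how data vs. DB devices get split differs between ceph-volume
versions, so check "ceph-volume lvm batch --help" first; device names
below are made up):

    # dry-run a whole host: HDDs become data LVs, the SSD gets the block.db LVs
    ceph-volume lvm batch --report /dev/sdc /dev/sdd /dev/sde /dev/sdb
    # "ceph-volume lvm list" later shows which LV belongs to which OSD

Drop --report once the proposed layout looks right.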


Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Wed, 5 Dec 2018 at 09:28, Jan Kasprzak <kas@xxxxxxxxxx> wrote:
>
>         Hello, CEPH users,
>
> having upgraded my CEPH cluster to Luminous, I plan to add new OSD hosts,
> and I am looking for setup recommendations.
>
> Intended usage:
>
> - small-ish pool (tens of TB) for RBD volumes used by QEMU
> - large pool for object-based cold (or not-so-hot :-) data,
>         write-once read-many access pattern, average object size
>         10s or 100s of MBs, probably custom programmed on top of
>         libradosstriper.
>
> Hardware:
>
> The new OSD hosts have ~30 HDDs 12 TB each, and two 960 GB SSDs.
> There is a small RAID-1 root and RAID-1 swap volume spanning both SSDs,
> leaving about 900 GB free on each SSD.
> The OSD hosts have two CPU sockets (32 cores including SMT), 128 GB RAM.
>
> My questions:
>
> - Filestore or Bluestore? -> probably the latter, but I am also considering
>         using the OSD hosts for QEMU-based VMs which are not performance
>         critical, and then having the kernel balance the memory usage
>         between ceph-osd and qemu processes (using Filestore) would
>         probably be better? Am I right?
>
> - block.db on SSDs? The docs recommend about 4 % of the data size
>         for block.db, but my SSDs are only 0.6 % of total storage size.
>
> - or would it be better to leave SSD caching on the OS and use LVMcache
>         or something?
>
> - LVM or simple volumes? I find it a bit strange and bloated to create
>         32 VGs, each VG for a single HDD or SSD, and have 30 VGs with only
>         one LV. Could I use /dev/disk/by-id/wwn-0x5000.... symlinks to have
>         stable device names instead, and have only two VGs for two SSDs?
>
> Thanks for any recommendations.
>
> -Yenya
>
> --
> | Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
> | http://www.fi.muni.cz/~kas/                         GPG: 4096R/A45477D5 |
>  This is the world we live in: the way to deal with computers is to google
>  the symptoms, and hope that you don't have to watch a video. --P. Zaitcev
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



