* Bluestore. It's so much better than Filestore. The latest versions add
  more control over memory usage with cache autotuning; check out the
  latest Luminous release notes.

* ~15 HDDs per SSD is usually too much. Note that you will lose all of the
  HDDs if the SSD dies: an OSD without its block.db is useless and must be
  re-created from scratch. Depending on the type of SSD, it might also be
  overloaded. It's not a big deal if the block.db is too small; exact usage
  depends on object size and omap usage. ~1% is fine if you have ~4 MB
  objects and basically no omap. If it is too small, cold data will simply
  spill over to the slow OSD disk.

* Using LVM cache is possible; the same caveat about losing the disk applies.

* LVM. There are a few older discussions here on the mailing list about
  whether this is considered "overhead". It's a little more annoying to
  handle from an operations perspective, but still better than the old
  ceph-disk + udev issues during startup.

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

Am Mi., 5. Dez. 2018 um 09:28 Uhr schrieb Jan Kasprzak <kas@xxxxxxxxxx>:
>
>         Hello, CEPH users,
>
> having upgraded my CEPH cluster to Luminous, I plan to add new OSD hosts,
> and I am looking for setup recommendations.
>
> Intended usage:
>
> - small-ish pool (tens of TB) for RBD volumes used by QEMU
>
> - large pool for object-based cold (or not-so-hot :-) data,
>         write-once read-many access pattern, average object size
>         10s or 100s of MBs, probably custom programmed on top of
>         libradosstriper.
>
> Hardware:
>
> The new OSD hosts have ~30 HDDs 12 TB each, and two 960 GB SSDs.
> There is a small RAID-1 root and RAID-1 swap volume spanning both SSDs,
> leaving about 900 GB free on each SSD.
> The OSD hosts have two CPU sockets (32 cores including SMT), 128 GB RAM.
>
> My questions:
>
> - Filestore or Bluestore? -> probably the latter, but I am also considering
>         using the OSD hosts for QEMU-based VMs which are not performance
>         critical, and then having the kernel balance the memory usage
>         between ceph-osd and qemu processes (using Filestore) would
>         probably be better? Am I right?
>
> - block.db on SSDs? The docs recommend about 4 % of the data size
>         for block.db, but my SSDs are only 0.6 % of the total storage size.
>
> - or would it be better to leave SSD caching to the OS and use LVMcache
>         or something?
>
> - LVM or simple volumes? I find it a bit strange and bloated to create
>         32 VGs, each VG for a single HDD or SSD, and have 30 VGs with only
>         one LV. Could I use /dev/disk/by-id/wwn-0x5000.... symlinks to have
>         stable device names instead, and have only two VGs for the two SSDs?
>
> Thanks for any recommendations.
>
> -Yenya
>
> --
> | Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
> | http://www.fi.muni.cz/~kas/                     GPG: 4096R/A45477D5     |
> This is the world we live in: the way to deal with computers is to google
> the symptoms, and hope that you don't have to watch a video. --P. Zaitcev
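
As a concrete illustration of the cache autotuning Paul refers to: recent
Luminous point releases expose bluestore_cache_autotune and a per-OSD
osd_memory_target. A minimal ceph.conf sketch, with the 4 GiB figure chosen
purely as an example rather than taken from this thread:

    [osd]
    # let bluestore size its caches automatically (the default where the
    # autotuner is available)
    bluestore_cache_autotune = true
    # rough per-OSD memory budget the autotuner aims for, in bytes (4 GiB here)
    osd_memory_target = 4294967296

With ~30 OSDs and 128 GB of RAM per host, the target would have to be set
well below 4 GiB if QEMU VMs are colocated on the same machines.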
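
For the block.db-on-SSD layout discussed above, the usual approach with
ceph-volume is one small LV per HDD carved out of the SSD. A sketch with
hypothetical device names (/dev/sda as one HDD, /dev/sdaf3 as the free
partition on one SSD) and an arbitrary 60 GB DB size:

    # one VG on the SSD's free space, one block.db LV per HDD-backed OSD
    vgcreate ceph-db-0 /dev/sdaf3
    lvcreate -L 60G -n db-sda ceph-db-0

    # bluestore OSD with data on the HDD and block.db on the SSD LV
    ceph-volume lvm create --bluestore --data /dev/sda --block.db ceph-db-0/db-sda

When given a raw data device, ceph-volume creates the VG and LV for it
itself, which is where the "one VG per HDD" layout Jan finds bloated
comes from.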
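
And for the LVM cache alternative, a rough sketch of putting a dm-cache
layer from the SSD in front of an HDD-backed data LV. The VG and LV names
(ceph-sda, osd-data) and the SSD partition are made up for illustration,
and the same "lose the SSD, lose the OSDs" caveat applies:

    # add a slice of the SSD to the HDD's VG and build a cache pool on it
    vgextend ceph-sda /dev/sdaf4
    lvcreate -L 50G -n sda_cache     ceph-sda /dev/sdaf4
    lvcreate -L 1G  -n sda_cachemeta ceph-sda /dev/sdaf4
    lvconvert --type cache-pool --poolmetadata ceph-sda/sda_cachemeta ceph-sda/sda_cache

    # attach the cache pool to the OSD data LV
    lvconvert --type cache --cachepool ceph-sda/sda_cache ceph-sda/osd-data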