I think there are a couple of reasons for LVM OSDs:

- bluestore cannot handle multipath devices; you need LVM for that
- the OSD metadata does not require a separate partition
- it is easy to provision two or more OSDs per disk
- LVM's dm-cache is an alternative to separate block/DB devices. It can be
  resized dynamically at run-time and also allows deviating from the 3/30/300
  sizing without wasting fast storage capacity. For example, we plan to have a
  1TB dm-cache per spinning disk on NVMe in the future; this would not only fit
  the WAL/DB, it would also cache hot data. In addition, one can configure it
  not to promote on first hit, to prevent cache wiping by backup software.

I find LVM OSDs much easier to administrate. I'm also using customized
scripts, and the "ceph-volume lvm" command suite simplifies things a lot.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Marc <Marc@xxxxxxxxxxxxxxxxx>
Sent: 19 March 2021 20:17:28
To: Nico Schottelius; ceph-users
Subject: Re: LVM vs. direct disk access

I asked exactly the same question about a year ago. Sage told me to show
evidence of a significant impact, because they had not measured one. If I
remember correctly, the idea behind this is that not all storage devices are
available as /dev/sdX like a normal disk, and LVM sort of solves this problem.
(Maybe related to some NVMe devices.) Maybe you can search for it in the
mailing list ;)

> -----Original Message-----
> From: Nico Schottelius <nico.schottelius@xxxxxxxxxxx>
> Sent: 19 March 2021 20:12
> To: ceph-users <ceph-users@xxxxxxx>
> Subject: LVM vs. direct disk access
>
>
> Good evening,
>
> I've seen the shift in Ceph to focus more on LVM than on plain (direct)
> access to disks. I was wondering what the motivation is for that.
>
> From my point of view OSD disk layouts never change (they are re-added
> if they do), so the dynamic approach of LVM is probably not the
> motivation.
>
> LVM also adds another layer of indirection, and it seems it would be a
> disadvantage performance-wise as well as added complexity for
> management. The former is probably only a minor degradation; the latter
> is something I see more as an obstacle for maintenance.
>
> At ungleich we are using a custom script [0] to format a disk with two
> partitions, one for the metadata, one for the rest, which seems
> simpler.
>
> I assume there are good reasons not to do as we do, but I was wondering
> what the practical reasons actually are.
>
> Best regards,
>
> Nico
>
> [0] https://code.ungleich.ch/ungleich-public/ungleich-tools/-
> /blob/master/ceph/ceph-osd-create-start
>
> --
> Sustainable and modern Infrastructures by ungleich.ch
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
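
[Editor's note: the multi-OSD-per-disk and dm-cache layout Frank describes
could be sketched roughly as below. This is an illustrative, untested outline,
not Frank's actual scripts: the device names /dev/sdb and /dev/nvme0n1, the VG
and LV names, and the sizes are placeholders, and the commands require root on
a host with those devices.]

```shell
# Assumption: /dev/sdb is the spinning data disk, /dev/nvme0n1 the fast device.
# Create a VG on the data disk and split it into two LVs, one per OSD.
vgcreate ceph-sdb /dev/sdb
lvcreate -l 50%VG -n osd0 ceph-sdb
lvcreate -l 100%FREE -n osd1 ceph-sdb

# Carve a cache volume out of the NVMe and attach it as a dm-cache
# in front of one data LV (1TB as in Frank's example; sizes illustrative).
vgextend ceph-sdb /dev/nvme0n1
lvcreate -L 1T -n cache0 ceph-sdb /dev/nvme0n1
lvconvert --type cache --cachemode writeback \
    --cachevol cache0 ceph-sdb/osd0

# Cache behaviour (e.g. how aggressively blocks are promoted) can be tuned
# at run-time via --cachesettings; see lvmcache(7) for the available knobs.
lvchange --cachesettings 'migration_threshold=2048' ceph-sdb/osd0

# Finally hand the cached LV to ceph-volume as a bluestore OSD.
ceph-volume lvm create --bluestore --data ceph-sdb/osd0
```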