I did that, but I am using ceph-ansible 3.0.8, which doesn't support automatic creation of the LVM volumes :( I think the 3.1 release has LVM support. For some reason I have to stick with 3.0.8, so I need to create the volumes manually.
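This is roughly what I am planning to do per SSD, just as a rough sketch (the device /dev/sdb and the vg/lv names below are placeholders from my own notes, not anything official, and I am assuming bluestore simply keeps the DB/WAL inside the data LV when no separate db/wal device is given):

    # one PV/VG per physical SSD, and a single LV using the whole disk for
    # the bluestore data (block) device; DB/WAL then live on the same LV
    # (device and names are placeholders from my setup)
    pvcreate /dev/sdb
    vgcreate ceph-block-sdb /dev/sdb
    lvcreate -l 100%FREE -n osd-data-sdb ceph-block-sdb

and then point ceph-ansible at it with:

    lvm_volumes:
      - data: osd-data-sdb
        data_vg: ceph-block-sdb

Does that look sane, or do I still need a second LV for the WAL/DB?
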
On Tue, Jul 24, 2018 at 8:34 AM, Alfredo Deza <adeza@xxxxxxxxxx> wrote:
> On Mon, Jul 23, 2018 at 2:33 PM, Satish Patel <satish.txt@xxxxxxxxx> wrote:
>> Alfredo,
>>
>> Thanks, I think I should go with LVM then :)
>>
>> I have a question here. I have 4 physical SSDs per server, and for some reason I am using ceph-ansible 3.0.8, which doesn't create the LVM volumes itself, so I have to create them manually.
>>
>> I am using bluestore (I want to keep the WAL/DB on the same data disk). How do I create the LVM volumes manually on a single physical disk? Do I need to create two logical volumes (one for the journal and one for the data)?
>>
>> I am reading this: http://docs.ceph.com/ceph-ansible/master/osds/scenarios.html (at the bottom)
>>
>>   lvm_volumes:
>>     - data: data-lv1
>>       data_vg: vg1
>>       crush_device_class: foo
>
> For a raw device (e.g. /dev/sda) you can do:
>
>   lvm_volumes:
>     - data: /dev/sda
>
> The LV gets created for you in this one case.
>
>> In the above example, did they create vg1 (the volume group) and data-lv1 (the logical volume) beforehand? If I want to add a journal, do I need to create one more logical volume? I am confused by that document, so I need some clarification.
>>
>> On Mon, Jul 23, 2018 at 2:06 PM, Alfredo Deza <adeza@xxxxxxxxxx> wrote:
>>> On Mon, Jul 23, 2018 at 1:56 PM, Satish Patel <satish.txt@xxxxxxxxx> wrote:
>>>> This is a great explanation. Based on your details it looks like, when rebooting a machine (OSD node), it will take a longer time to initialize all of the OSDs, but if we use LVM it shortens that time.
>>>
>>> That is one aspect, yes. Most importantly: all OSDs will consistently come up with ceph-volume. This wasn't the case with ceph-disk, and it was impossible to replicate or understand why (hence the 3 hour timeout).
>>>
>>>> There is a good chance that LVM impacts performance because of the extra layer. Does anyone have any data which can provide some insight into good or bad performance? It would be great if you could share it, so it will help us understand the impact.
>>>
>>> There isn't a performance impact, and if there is, it is negligible.
>>>
>>>> On Mon, Jul 23, 2018 at 8:37 AM, Alfredo Deza <adeza@xxxxxxxxxx> wrote:
>>>>> On Mon, Jul 23, 2018 at 6:09 AM, Nicolas Huillard <nhuillard@xxxxxxxxxxx> wrote:
>>>>>> On Sunday, July 22, 2018 at 09:51 -0400, Satish Patel wrote:
>>>>>>> I read that post, and that's why I opened this thread for a few more questions and clarification.
>>>>>>>
>>>>>>> When you said the OSD doesn't come up, what does that actually mean? After a reboot of the node, after a service restart, or after installation of a new disk?
>>>>>>>
>>>>>>> You said we are using a manual method; what is that?
>>>>>>>
>>>>>>> I'm building a new cluster and have zero prior experience, so how can I reproduce this error to see that LVM is really a life-saving tool here? I'm sure there are plenty of people using it, but I didn't find any good document except that mailing list thread, which raised more questions in my mind.
>>>>>>
>>>>>> When I had to change a few drives manually, copying the old contents over, I noticed that the logical volumes are tagged with lots of information related to how they should be handled at boot time by the OSD startup system.
>>>>>> These LVM tags are a good standard way to add that metadata within the volumes themselves. Apparently, there is no other way to add tags that can describe bluestore/filestore, SATA/SAS/NVMe, whole drive or partition, etc.
>>>>>> They are easy to manage and fail-safe in many configurations.
>>>>>
>>>>> This is spot on. To clarify even further, let me give a brief overview of how that worked with ceph-disk and GPT GUIDs:
>>>>>
>>>>> * at creation time, ceph-disk would add a GUID to the partitions so that they would later be recognized. These GUIDs were unique, so they would ensure accuracy
>>>>> * a set of udev rules would be in place to detect when these GUIDs became available in the system
>>>>> * at boot time, udev would start detecting devices coming online, and the rules would call out to ceph-disk (the executable)
>>>>> * the ceph-disk executable would then call out to the ceph-disk systemd unit, with a timeout of three hours, for the device to which it was assigned (e.g. ceph-disk@/dev/sda)
>>>>> * the previous step would be done *per device*, waiting for all devices associated with the OSD to become available (hence the 3 hour timeout)
>>>>> * the ceph-disk systemd unit would call back again to the ceph-disk command line tool, signaling that the devices are ready (with --sync)
>>>>> * the ceph-disk command line tool would call *the ceph-disk command line tool again* to "activate" the OSD, having detected (finally) the device type (encrypted, partially prepared, etc.)
>>>>>
>>>>> The above workflow worked for pre-systemd systems; it could probably have been streamlined better, but it was what allowed us to "discover" devices at boot time. The 3 hour timeout was there because udev would find these devices becoming active asynchronously, and ceph-disk was trying to coerce a more synchronous behavior to get all the devices it needed. On a dense OSD node, this meant that OSDs would inconsistently fail to come up at all (sometimes all of them would work!).
>>>>>
>>>>> Device discovery is a tremendously complicated and difficult problem to solve, and we thought that a few simple udev rules would be the answer (they weren't). The LVM implementation of ceph-volume limits itself to just asking LVM about the devices and then gets them "activated" at once. In some tests on nodes with ~20 OSDs, we were 10x faster to come up (compared to ceph-disk), and fully operational, every time.
>>>>>
>>>>> Since this is a question that keeps coming up, and answers are now getting a bit scattered, I'll consolidate them all into a section in the docs. I'll try to address the "layer of complexity", "performance overhead", and the other recurring concerns.
>>>>>
>>>>> Any other ideas are welcome if some of the previously discussed things are still not entirely clear.
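A quick note from my side while re-reading this part: as far as I can tell, the tags Nicolas and Alfredo mention can be inspected with plain LVM tooling and with ceph-volume itself (this is just from my own digging around, so treat the exact field names as approximate):

    # show the metadata ceph-volume stores as LVM tags on each logical volume
    # (fields like ceph.osd_id, ceph.osd_fsid, ceph.type, as I understand it)
    lvs -o lv_name,vg_name,lv_tags

    # or let ceph-volume print its own view of the OSDs it manages
    ceph-volume lvm list

    # after a reboot, all OSDs can be brought up in one pass
    ceph-volume lvm activate --all

That made the "no udev guessing" part much clearer to me.
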
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>>
>>>>>>> > On Jul 22, 2018, at 6:31 AM, Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx> wrote:
>>>>>>> >
>>>>>>> > I don't think it will get any more basic than that. Or maybe this: if the doctor diagnoses you, you can either accept it, get a second opinion, or study medicine to verify it.
>>>>>>> >
>>>>>>> > In short, LVM has been introduced to solve some issues related to starting OSDs (which I did not have, probably because of a 'manual' configuration). And it opens the ability to support more (future) devices.
>>>>>>> >
>>>>>>> > I gave you two links; did you read the whole thread?
>>>>>>> > https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg47802.html
>>>>>>> >
>>>>>>> > -----Original Message-----
>>>>>>> > From: Satish Patel [mailto:satish.txt@xxxxxxxxx]
>>>>>>> > Sent: Saturday, July 21, 2018 20:59
>>>>>>> > To: ceph-users
>>>>>>> > Subject: Why lvm is recommended method for bluestore
>>>>>>> >
>>>>>>> > Folks,
>>>>>>> >
>>>>>>> > I think I am going to boil the ocean here. I googled a lot about this topic (why LVM is the recommended method for bluestore) but didn't find any good, detailed explanation, not even on the official Ceph website.
>>>>>>> >
>>>>>>> > Can someone explain it here in basic language? I am in no way an expert, so I just want to understand what the advantage is of adding an extra layer of complexity.
>>>>>>> >
>>>>>>> > I found this post, but I got lost reading it and would like to see what other folks suggest and offer in their own words:
>>>>>>> > https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg46768.html
>>>>>>> >
>>>>>>> > ~S
>>>>>>
>>>>>> --
>>>>>> Nicolas Huillard