On Fri, Dec 6, 2019 at 8:31 AM Sage Weil <sage@xxxxxxxxxxxx> wrote:
>
> My thoughts here are still pretty inconclusive...
>
> I agree that we should invest in a non-LVM mode, but there isn't a way to do
> that currently that supports dm-crypt that isn't complicated and
> convoluted, so it cannot be a full replacement for the LVM mode.

The `ceph-volume simple` sub-command does allow dmcrypt. The key is
stored in the JSON file in /etc/ceph/osd. Is there a scenario you've
seen where this is not possible? The `simple` sub-command would even
allow partitions (regardless of ceph-disk).

>
> At the same time, Real Soon Now we're going to be building crimson OSDs
> backed by ZNS SSDs (and eventually persistent memory), which will also
> very clearly not be LVM-based. I'm a bit hesitant to introduce a
> bare-bones bluestore mode right now just because we'll be adding yet
> another variation soon, and it may be that we construct a general approach
> to both... but probably not. And the whole point of c-v's architecture
> was to be pluggable.
>
> So maybe a bare-bones bluestore mode makes sense. In the simple case, it
> really should be *very* simple. But its scope pretty quickly explodes:
> what about wal and db devices? We have labels for those, so we could
> support those, also easily... if the user has to partition the devices
> beforehand manually. They'll immediately want to use the new
> auto/batch thing, but that's tied to the LVM implementation. And what
> if one of the db/wal/main devices is an LV and another is not? We'd
> need to make sure the lvm mode machinery doesn't trigger unless all of
> its labels are there, but it might be confusing. All of which means that
> this is probably *only* useful for single-device OSDs. On the one hand,
> those are increasingly common (hello, all-SSD clusters), but on the other
> hand, for fast SSDs we may want to deploy N of them per device.
>
> Since we can't cover all of that, and at a minimum, we can't cover
> dm-crypt, Rook will need to behave with the lvm mode one way or another.
> So we need to have a wrapper (or something similar) no matter what. So I
> suggest we start there.
>
> sage
>
>
> On Fri, 6 Dec 2019, Sebastien Han wrote:
>
> > Hi Kai,
> >
> > Thanks!
> > –––––––––
> > Sébastien Han
> > Senior Principal Software Engineer, Storage Architect
> >
> > "Always give 100%. Unless you're giving blood."
> >
> > On Fri, Dec 6, 2019 at 10:44 AM Kai Wagner <kwagner@xxxxxxxx> wrote:
> > >
> > > Hi Sebastien and thanks for your feedback.
> > >
> > > On 06.12.19 10:00, Sebastien Han wrote:
> > > > ceph-volume is a sunk cost!
> > > > And your argument basically falls into that paradigm, "oh we have
> > > > invested so much already, that we cannot stop and we should continue
> > > > even though this will only bring more trouble". Incapable of accepting
> > > > this sunk cost.
> > > > All the issues that have been fixed with a lot of pain.
> > > > All that pain could have been avoided if LVM wasn't there, and pursuing
> > > > that direction will only lead us to more pain again.
> > >
> > > The reason I disagree here is the scenario where the WAL/DB is on a
> > > separate device and a single OSD crashes. In that case you would like to
> > > recreate just that single OSD instead of the whole group. Also, if we
> > > deprecate a tool as we did with ceph-disk, users have to migrate
> > > sooner or later if they don't want to do everything manually on the CLI
> > > (by that I mean via fdisk/pure lvm commands and so on).
> > >
> > > We could argue now that this can still be done on the command line
> > > manually, but all our efforts are towards simplicity/automation and
> > > having everything in the Dashboard. If the underlying tool/functionality
> > > isn't there anymore, that isn't possible.
> > >
> >
> > I understand your position. Yes, when we start separating block/db/wal
> > things get really complex; that's why I'm sticking with block/db/wal in
> > the same block.
> > Also, we haven't seen any request for separating those when running
> > OSDs on PVC in the Cloud. So we would likely continue to do so for a
> > while.
> >
> > > > Also, I'm not saying we should replace the tool but allow not using
> > > > LVM for a simple scenario to start with
> > >
> > > Which then leads me to: why couldn't such functionality be implemented
> > > in a single tool instead of having two in the end?
> > >
> > > So don't get me wrong, I'm not saying that I'm against everything, I'm
> > > just saying that I think this is a topic that should be discussed in
> > > more depth.
> >
> > Yes, that's for sure.
> >
> > >
> > > As said, just my two cents here.
> > >
> > > Kai
> > >
> > > --
> > > SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg
> > > GF:Geschäftsführer: Felix Imendörffer, (HRB 36809, AG Nürnberg)
> > >
> > >
> > _______________________________________________
> > Dev mailing list -- dev@xxxxxxx
> > To unsubscribe send an email to dev-leave@xxxxxxx