On Fri, Dec 06, 2019 at 01:31:05PM +0000, Sage Weil wrote:
>My thoughts here are still pretty inconclusive...
>
>I agree that we should invest in a non-LVM mode, but there isn't a way to
>do that currently that supports dm-crypt that isn't complicated and
>convoluted, so it cannot be a full replacement for the LVM mode.
>
>At the same time, Real Soon Now we're going to be building crimson OSDs
>backed by ZNS SSDs (and eventually persistent memory), which will also
>very clearly not be LVM-based. I'm a bit hesitant to introduce a
>bare-bones bluestore mode right now just because we'll be adding yet
>another variation soon, and it may be that we construct a general approach
>to both... but probably not. And the whole point of c-v's architecture
>was to be pluggable.
>
>So maybe a bare-bones bluestore mode makes sense. In the simple case, it
>really should be *very* simple. But its scope pretty quickly explodes:
>what about wal and db devices? We have labels for those, so we could
>support those, also easily... if the user has to partition the devices
>beforehand manually. They'll immediately want to use the new
>auto/batch thing, but that's tied to the LVM implementation. And what
>if one of the db/wal/main devices is an LV and another is not? We'd
>need to make sure the lvm mode machinery doesn't trigger unless all of
>its labels are there, but it might be confusing. All of which means that
>this is probably *only* useful for single-device OSDs. On the one hand,
>those are increasingly common (hello, all-SSD clusters), but on the other
>hand, for fast SSDs we may want to deploy N of them per device.

I don't think a simple or bare-bones approach will survive contact with
real-world deployments. IMHO, if we want a raw mode, we had better be
prepared to deal with multi-device OSDs and multi-OSD devices, and the
partitioning this requires.
>
>Since we can't cover all of that, and at a minimum, we can't cover
>dm-crypt, Rook will need to behave with the lvm mode one way or another.
>So we need to have a wrapper (or something similar) no matter what. So I
>suggest we start there.

Agreed.

>
>sage
>
>
>On Fri, 6 Dec 2019, Sebastien Han wrote:
>
>> Hi Kai,
>>
>> Thanks!
>> –––––––––
>> Sébastien Han
>> Senior Principal Software Engineer, Storage Architect
>>
>> "Always give 100%. Unless you're giving blood."
>>
>> On Fri, Dec 6, 2019 at 10:44 AM Kai Wagner <kwagner@xxxxxxxx> wrote:
>> >
>> > Hi Sebastien and thanks for your feedback.
>> >
>> > On 06.12.19 10:00, Sebastien Han wrote:
>> > > ceph-volume is a sunk cost!
>> > > And your argument basically falls into that paradigm, "oh we have
>> > > invested so much already, that we cannot stop and we should continue
>> > > even though this will only bring more trouble". Incapable of accepting
>> > > this sunk cost.
>> > > All the issues that have been fixed with a lot of pain.
>> > > All that pain could have been avoided if LVM wasn't there and pursuing
>> > > that direction will only lead us to more pain again.
>> >
>> > The reason I disagree here is the scenario where the WAL/DB is on a
>> > separate device and a single OSD crashes. In that case you would like to
>> > recreate just that single OSD instead of the whole group. Also, if we
>> > deprecate a tool, as we did with ceph-disk, users have to migrate
>> > sooner or later if they don't want to do everything manually on the CLI
>> > (by that I mean via fdisk/pure lvm commands and so on).
>> >
>> > We could argue now that this can still be done on the command line
>> > manually, but all our efforts are towards simplicity/automation and
>> > having everything in the Dashboard. If the underlying tool/functionality
>> > isn't there anymore, that isn't possible.
>> >
>>
>> I understand your position. Yes, when we start separating block/db/wal,
>> things get really complex; that's why I'm sticking with block/db/wal in
>> the same block.
>> Also, we haven't seen any request for separating those when running
>> OSDs on PVC in the Cloud. So we would likely continue to do so for a
>> while.
>>
>> > > Also, I'm not saying we should replace the tool but allow not using
>> > > LVM for a simple scenario to start with
>> >
>> > Which then leads me to: why couldn't such functionality be implemented
>> > in a single tool instead of having two at the end?
>> >
>> > So don't get me wrong, I'm not saying that I'm against everything, I'm
>> > just saying that I think this is a topic that should be discussed in
>> > more depth.
>>
>> Yes, that's for sure.
>>
>> >
>> > As said, just my two cents here.
>> >
>> > Kai
>> >
>> > --
>> > SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg
>> > GF:Geschäftsführer: Felix Imendörffer, (HRB 36809, AG Nürnberg)
>> >
>> >
>> _______________________________________________
>> Dev mailing list -- dev@xxxxxxx
>> To unsubscribe send an email to dev-leave@xxxxxxx
>>

--
Jan Fajerski
Senior Software Engineer Enterprise Storage
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer