Hello,

# dropped an address from Cc as it's not reachable.

On Wed, 01 Apr 2020 05:16:58 +0900,
Johannes Berg wrote:

> > > For the step 1, we put LKL as one of UMMODE in order to make less effort to
> > > integrate (make ARCH=um UMMODE=library). The modification to existing UML
> > > code is trying to be minimized.
> >
> > The current step (1 in the milestone) tries to cover this goal:
> > splitting ARCH=um into UMMODE_KERNEL and UMMODE_LIB.
>
> So maybe we're doing this backwards?
>
> I mean ... you're trying to minimize the UML code changes, while I'm
> sort of arguing for maximizing them, to achieve a cleaner split.

I see the point.

> In a sense, I think if this is to happen, then we're in it for the long
> haul. Meaning that we don't actually need all of this working
> immediately.
>
> So I think conceptually we should answer the questions that I raised
> below first (basically a kind of "can it be done?" question), and then
> work towards that goal? IMHO.

Okay, agreed.

> > > > 1) We give up, and get ARCH=lkl, sharing just (some of) the drivers
> > > >    while moving them into the right drivers/somewhere/ place. Even that
> > > >    looks somewhat awkward looking at the later patches in this set, but
> > > >    seems like that at *least* should be done.
> > >
> > > Yeah, this would be a goal.
> > > UML and LKL are quite different but they should share at least their
> > > userspace drivers.
> > > I also don't mind if we don't share every driver at the beginning but
> > > it should be a feasible goal for the future.
> >
> > Sharing drivers code is also included in this patchset, step 2 in the
> > milestone.
> >
> > I was thinking that implementing os_*() functions with lkl_host_ops
> > would be the further goal (e.g., step 3 or 4).
>
> Personally, I think this is backwards.
> That step is the actually
> *interesting* part, because if this turns out not to be possible, then
> we should pick option (1) instead of trying to do option (2), failing,
> and leaving the code a mess (at least personally I think that after this
> patchset, the code is kinda a mess with all the ifdefs, duplication,
> etc.) Yes, I know you're in this for the long haul, but still - it'd be
> a shame to have to do that.
>
> So in a sense, I myself would actually prefer to have an LKL _without_
> drivers, but integrated well with UML, over the one that you have now.

An LKL without drivers might amount to nothing, but let's see whether
this ends up as a clean, minimal viable patchset.

> > > > 2) Ideally, instead, we actually unify: LKL grows support for userspace
> > > >    processes using UML infrastructure, the "in-kernel" IRQ mechanisms
> > > >    are unified, UML stuff moves into lkl-ops, and the UML binary
> > > >    basically becomes a user of the LKL library to start everything up.
> > > >    There may be some bits remaining that are just not interesting (e.g.
> > > >    some drivers you don't care about would continue to make direct calls
> > > >    to the user side instead of lkl-ops, and then they're just not
> > > >    compatible with lkl, only with the uml special case of lkl), but then
> > > >    things are clean.
> > >
> > > A few months ago I thought this is doable but now I'm not so sure anymore.
> >
> > For the part of (2) which Johannes pointed out (I mean the part "UML
> > stuff moves into lkl-ops"), I become to think that implementing os_*()
> > functions using lkl_host_ops would be also interesting if those
> > re-implementation makes the glue code eliminated.
> >
> > I'll work on that.
>
> Don't go too fast :-)
>
> I really think that this only makes sense if we can also share much of
> the other code, e.g. the interrupt processing, thread model, etc.
> If we
> just share the lkl ops underneath, and then end up implementing two IRQ
> models and all on top of those, IMHO we've won nothing.
>
> So I (at least) really see it as a choice between these two options:
>
> 1) add LKL as arch/lkl/ and share the drivers, but not the arch code
>
> 2) really unify LKL and UML, and have them share the arch code, and make
>    UML a special case of LKL, but not in the sense that it has vastly
>    different arch code (like special interrupt handling, etc.)

1) is not an option: we discussed before that we should not have
similar archs in different directories.

# Richard, correct me if I'm wrong.

2) may be doable. The following is the list of things we would need:

- unify the driver code
- unify interrupt handling, which needs to be x86/linux-independent
- unify the thread model (i.e., struct thread_info)
- unify code scheduling (e.g., __switch_to())
- unify memory management (mmu vs. nommu)
- unify the host interface (os_*() vs. lkl_host_ops)
- support multiple syscall handling methods (ptrace-based interception,
  direct function call, etc.)
- (I may still be missing something.)

This is actually the list of differences between UML and LKL. The
items are inter-related: interrupt handling depends on how the
code/thread is executed, the thread implementation interacts with
memory management, etc.

Thus, for instance, unifying the IRQ mechanism may require reorganizing
or reimplementing UML code at several points of the above list in order
to support the library mode. I thought this would mean broader changes
to UML, which I was trying to avoid in the v4 patch.

> Now, you allude to the fact that UML is pretty much x86 only, and
> perhaps that's a point where we can do (2) but only support userspace
> programs on x86, or such?

x86/linux only (LKL also has x86/win32 support).

> I don't know where the host architecture
> actually comes in much in UML, and where that may or may not be the case
> in LKL.
>
> > For the other part of (2), I agree that your definition of the
> > unification will be the best and final goal of this integration.
>
> Fair, but the problem is that we have to decide *now* whether it's
> actually possible or not. If not, then IMHO it's a bad choice to even
> put LKL under arch/um/.
>
> > But, especially the support for UML userspace processes in LKL is not
> > easy as far as I can see
>
> OK, I'll bite - why not? I mean, what's different enough in LKL and UML
> to make this "not easy"?
>
> I'm not trying to paint you into a corner here with that, I'm just
> trying to understand the innards of LKL better. I have a _bit_ of a
> grasp of the UML internals by now, but of course not LKL.
>
> So where do they differ? Conceptually, they seem very similar, but the
> details actually are different.
>

The differences (and conflicts) between UML and LKL are pretty much
the unification list under 2) above.

The current LKL design assumes that there is only a single
process/application (which can have multiple threads) in a single LKL
kernel instance. This assumption simplifies the design in several
places of the kernel:

- no memory protection across multiple users/processes is needed
- no address space separation between user and kernel code is needed

As a result, userspace code can call syscalls directly as function
calls. This part would need to be extended to support running/spawning
multiple processes (as UML does), which, as I mentioned, the current
LKL doesn't have.

> But I have the same question on e.g. the IRQ model. I mean, OK, I
> understand that LKL started from a different base and all, but is it
> actually *necessary* for LKL to have a different IRQ model? Or is that
> "just" inertia?

As for interrupt handling, an LKL interrupt is currently triggered by
a function call (lkl_trigger_irq or equivalent), while a UML interrupt
is triggered by fd notification (epoll).

# LKL actually uses fd events too, but encapsulates them in lkl_host_ops.
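
As a rough illustration of the two trigger styles just described, here
is a minimal sketch in which fd notification is layered on top of a
function-call trigger. This is not LKL or UML code: every name below
(sketch_trigger_irq, sketch_run_irqs, sketch_register_fd,
sketch_poll_fds, the handlers table) is hypothetical, and only the
epoll calls are real Linux APIs.

```c
/* Sketch only: hypothetical names, not taken from LKL or UML. */
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

#define NR_IRQS 32

typedef void (*irq_handler_t)(int irq);

static irq_handler_t handlers[NR_IRQS];
static uint32_t pending;            /* one bit per pending IRQ */

/* Sample handler used for demonstration: records what it saw. */
static int sketch_hits;
static int sketch_last_irq = -1;
static void sketch_count_handler(int irq)
{
	sketch_last_irq = irq;
	sketch_hits++;
}

/* Function-call model: the host side marks an IRQ pending directly. */
static void sketch_trigger_irq(int irq)
{
	assert(irq >= 0 && irq < NR_IRQS);
	pending |= UINT32_C(1) << irq;
}

/* Run the handlers of all pending IRQs; returns how many handlers ran. */
static int sketch_run_irqs(void)
{
	int ran = 0;

	for (int irq = 0; irq < NR_IRQS; irq++) {
		uint32_t bit = UINT32_C(1) << irq;

		if (!(pending & bit))
			continue;
		pending &= ~bit;
		if (handlers[irq]) {
			handlers[irq](irq);
			ran++;
		}
	}
	return ran;
}

/* fd-notification model: remember the IRQ number in the epoll user data. */
static int sketch_register_fd(int epfd, int fd, int irq)
{
	struct epoll_event ev = { .events = EPOLLIN,
				  .data = { .u32 = (uint32_t)irq } };

	return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Poll once and convert each ready fd into a function-call trigger. */
static int sketch_poll_fds(int epfd, int timeout_ms)
{
	struct epoll_event evs[8];
	int n = epoll_wait(epfd, evs, 8, timeout_ms);

	for (int i = 0; i < n; i++)
		sketch_trigger_irq((int)evs[i].data.u32);
	return n;
}
```

In this arrangement only sketch_poll_fds() depends on the host (epoll);
everything else sees plain function calls, so a host without epoll
could supply a different poller.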
So now we have three options for interrupt handling:

- use the UML IRQ model in LKL (and UML)
- use the LKL IRQ model in UML (and LKL)
- use two IRQ models (the current v4 patch)

I gave up on the 1st approach because it drops the host-independent
characteristics of LKL, and thus took the 3rd approach. So to answer
your question: yes, a different IRQ model was needed. But maybe I
should try the 2nd approach as well to avoid duplication.

I will conduct another experiment if this sounds like the right
direction.

> > Or the title of the cover letter is somehow overstatement: instead,
> > "Minimum integration of UML with LKL" or something like this would be
> > better ?
>
> Heh, well, doesn't really matter?
>
> But again, there are a few different aspects here:
>  - what's technically feasible
>  - what this patchset achieves
>  - where we want to be in the end
>
> I think right now these are diverging enough that we can't even answer
> that last question, much less find the road to get there.

I think option 2) is the final goal.

> > Since the patchset of LKL is relatively huge, I was trying to make a
> > minimum patchset for the first step. Because having #ifdefs and glue
> > code makes existing arch/um code untouched, I took this way to make
> > the patchset as small as possible.
> >
> > But if this is not the case, I will look for the other way and touch
> > arch/um code to fit the current LKL implementation.
> >
> > What do you think ?
>
> I think that'd be fine, if indeed that's what we want to do.
>
> I really think we're beating around the bush, and need to first figure
> out the technical differences between UML and LKL and decide between the
> options (1) and (2) above. Maybe there's a compromise there somewhere,
> where some small bits of code still _are_ different, but IMHO having two
> (IRQ, thread, memory) models, two host interfaces (lkl-ops vs. os_*
> functions), even two include/asm/ source trees (and so on) is not
> appropriate.
>
> This may take some patches, and some experimentation. I'd leave drivers
> out of this initially - you should be able to test LKL with something
> simpler, right? The API surface is basically the syscall interface as
> functions, so you can start the library and call something simple
> initially? Though I guess you need some driver for the IRQ model to make
> sense, etc.

Okay.

> And like I said before, that decision will frame everything else. I
> really don't think we can make significant progress here without having
> decided whether this is possible.
>
> Perhaps UML *can* become a "special case" of LKL, with a special API
> function (that's not part of the syscall surface) to "boot(cmdline)" or
> something. But if it can't, and has to remain as separated as the two
> are today, I would argue we're better off just not calling them the same
> architecture.

I agree with this, provided the unification has been fully completed.

Thanks,

-- Hajime