On 3 Aug 2011, at 21:48, Pekka Enberg wrote:

> Your description doesn't really explain what you want to use this
> thing exactly for in userspace.

A typical use might be communicating between components in a
set-top-box (STB). This might involve:

* Some sort of GUI user interface (e.g., a browser). This will send
  control messages and receive state messages.

* Some sort of IR input, reading keypresses from a remote control. The
  program reading the keypresses will decide to send control messages
  for some of them.

* Possibly input from a mobile phone (over bluetooth or whatever),
  acting as another source of control. It's possible messages may also
  be received that require sending information back to the phone.

* A process reading data streams from the network and passing the
  appropriate parts therefrom to audio and video decoders. This will
  receive messages to tell it which programs to play, and send messages
  indicating what it is doing.

* Another process recording programs to disk, as directed by the user
  inputs. It may need to send messages to the process reading data
  streams. It will also send messages of interest to the GUI.

* A process playing programs back from disk, including "trick play" -
  that is, fast forward, skip and reverse. Obviously it receives
  messages telling it which program to play, and what trick play
  operations to perform. It in turn will send messages to the UI to say
  what it is doing.

Having the listener choose what it wants to listen to is a clear win in
these circumstances - it means that the sender of a message does not
need to know if a new piece of infrastructure is added that also wants
to receive it.

Similarly, allowing any sender to send a particular request also makes
sense, as several processes might want to ask the current location of
play in the displayed video stream, or to request some sort of trick
play action.
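
As an illustration only, the "listener chooses" idea above can be
sketched with a toy in-process model in Python. This is not KBUS and
not its API: the class ToyBus, its methods, and the message names are
all invented for the example.

```python
from collections import defaultdict


class ToyBus:
    """Toy in-process stand-in for a message bus (hypothetical, not KBUS)."""

    def __init__(self):
        # message name -> list of inboxes whose owners chose to listen
        self._listeners = defaultdict(list)

    def bind(self, inbox, name):
        """A listener opts in to a message name; senders are not involved."""
        self._listeners[name].append(inbox)

    def send(self, name, data):
        """Deliver to whoever is bound right now; return the delivery count."""
        receivers = self._listeners[name]
        for inbox in receivers:
            inbox.append((name, data))
        return len(receivers)


bus = ToyBus()
gui_inbox, recorder_inbox = [], []

# The GUI decides that it wants player state messages.
bus.bind(gui_inbox, "player.state")
first_count = bus.send("player.state", "playing")

# Later a recorder is added to the system. The sending code is unchanged.
bus.bind(recorder_inbox, "player.state")
second_count = bus.send("player.state", "paused")
```

The sender calls send() in exactly the same way before and after the
recorder appears, which is the property the STB example relies on.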

(I'm sure all of this could be done perfectly well with, for instance,
DBus as well, but I hope I've adequately explained elsewhere why that's
not an applicable solution.)

A small example might be several programs waiting for particular
conditions to be satisfied, and sending messages to a central program
which lights up LEDs according to the messages it receives.

Real examples of usage that aren't the STB are a bit difficult to give
because they belong to customer projects that we're not allowed to talk
about.

> On Fri, Jul 29, 2011 at 12:48 AM, Tony Ibbs <tibs@xxxxxxxxxxxxxx> wrote:
> > So why did we write it as a kernel module?
> > ==========================================
> > As implementors, a kernel module makes a lot of sense. Not least
> > because:
> >
> > * It gives us a lot of things for free, including list handling,
> >   reference counting, thread safety and (on larger systems)
> >   multi-processor support, which we would otherwise have to write and
> >   debug ourselves. This also keeps our codebase smaller.
>
> That's not a reason to put this into the kernel, really.

It's part of the reason why we wrote KBUS as a kernel module, which is
what this section was about. Agreed, it's not a reason that one can
readily use to argue that "X" (whatever that may be) should go in the
kernel-as-distributed, or we'd have all of user space there, which
would no longer be Linux (not sure what it *would* be).

> > * It helps give us reliability, partly because of the code we're
> >   relying on, partly because of the strictures of working in the
> >   kernel, partly by shielding us from userspace.
>
> So now instead of crashing in userspace, we crash the kernel? This
> seems like a bogus argument as well.

Well, ignoring the tone of that comment, the same argument as above
applies. Although I would point out that what I was saying was that it
would be intrinsically much less likely to crash anywhere because it is
a kernel module.

> > * It reduces message copying (we have userspace to kernel back to
> >   userspace, as opposed to a userspace daemon communicating with
> >   clients via sockets)
>
> Now this sounds like a real reason but you'd have to explain why you
> can't reuse existing zero-copy mechanisms like splice() and tee().

Hmm. vmsplice() too, presumably. I'll freely admit I don't know
anything beyond what I've just read about these functions. If one was
writing KBUS from scratch as a userspace library, with associated
daemon, then they might well be useful, but one would need to think
their use through very carefully, and I don't believe the code would be
simple (the image I have in mind is managing message structures with
two-metre long tongs, through an air-water boundary).

> > * It makes it simple for us to tell when a message recipient has "gone
> >   away", as the kernel will call our "release" callback for us.
>
> Again, sounds like a reasonable technical requirement but doesn't
> really justify putting all this code into the kernel.

I'll get back to that below.

> > * It allows us to provide the functionality on systems without
> >   requiring anything much beyond /dev and maybe /proc in userspace.
>
> Why is this important?

Because we sometimes want to target systems that do not need a
userspace filesystem, either because they are very simple (so their
needs can be satisfied by starting the necessary programs up in init),
or because they're trying to save space, or because they don't have any
physical storage associated with them, etc.

I assume the real point of your post is that I wrote about the reasons
why we made KBUS a kernel module, but did not really address the
reasons why KBUS might want to be a kernel module in general usage.

Obviously, there's one overriding reason, which is key:

* Inter-process messaging is hard to get right, and very easy to get
  wrong.

The kernel provides low-level mechanisms one can use to write a
userspace inter-process messaging system, but not an actual solution.
Our contention is that a simple inter-process messaging module is a
worthwhile addition to the toolkit supplied by the kernel.

The trick is not to get over-ambitious (clearly enterprise solutions
like DBus belong in userspace), but to provide a sensible minimum. KBUS
is our attempt at this, based on our experience of what one actually
needs in a relatively simple system.

Clearly, as the needs of a system grow, there is likely to be a point
at which larger, more powerful solutions may be necessary (inevitably
if you need things KBUS doesn't provide), but that shouldn't preclude
providing the simpler solution.

Otherwise, I'll try to give some subsidiary reasons below, but I'm
bound to have forgotten something. The points aren't in any particular
order.

* I already said that it is important that the kernel has a single
  point where it knows that a process has gone away. Knowing this is a
  fundamental requirement of KBUS, and it would be difficult and
  unreliable to do in userspace. I actually think this is a very
  important point, as it is at the core of how KBUS works.

* All the queues are in one place. If KBUS were a userspace daemon,
  then it would have to maintain the same queues as it does now (in
  order to get the same effect), plus some fraction of N message copies
  in transit through the kernel, where N is the number of clients
  sending/receiving messages at a particular time. With KBUS in the
  kernel, that "fraction of N" is not needed, and thus KBUS can account
  much more accurately for the memory it is using. This in turn means
  that it can be less conservative about the amount of memory available
  for its queues, meaning it can have more messages in transit.

  (Note that KBUS at the moment is nowhere near as good at this as it
  should be, but resource management is acknowledged to be a problem
  that we need to address, and it would be very simple to have a memory
  limit per bus.)

  Again, it's not that one can't do something similar in userspace, but
  that doing it in userspace is both more complicated and more
  wasteful.

* On embedded systems with not much memory, the OOM killer can be quite
  active in userspace. If the message system is crucial, then it is a
  big advantage having it in the kernel, where it cannot be killed
  (that's not to claim that KBUS as it stands is well suited to this
  use case, but it is more suitable than if it were a userspace
  daemon).

  (I do realise that there are ways of overriding the OOM killer per
  process, but being removed from the problem seems more sensible.)

* KBUS works in each client's priority, and thus avoids priority
  inversion problems, compared to userspace daemons. A userspace daemon
  must run at its own particular priority. If it is high, then a low
  priority program sending messages can starve a higher priority
  program, and if it is low, the low priority processes can preempt
  higher priority processes.

* Userspace peer-to-peer messaging via sockets (for instance) needs a
  persistent store of client identities ("names"). Writing this so that
  race conditions are minimised is not simple, and doing so makes the
  whole messaging infrastructure more complex. I hope the example at
  the beginning of this email makes it clearer why we'd rather not have
  such.

* It was mentioned before that KBUS being a kernel module makes it
  significantly smaller, as it can leverage code that is already
  present in the kernel. This can be important on embedded systems,
  since NAND flash is slow, and loading an extra few MB of library can
  slow the boot process down unacceptably. This matters to us quite a
  lot, it may matter less to the general kernel community...

* Despite having said that we weren't aiming for the sort of security
  handling that DBus provides, some security considerations are of
  interest. In particular, being a kernel module means that KBUS
  definitively knows the identity of the sender and recipient(s) of
  each message. This makes it possible, for instance, for a sender of a
  request to assert that it should only succeed (at "write" time) if
  the intended recipient is the one expected (so if the original
  recipient unbinds and a new recipient rebinds, this can be trivially
  detected - we use this so a sender can realise that the replier has
  changed and will not have any required state).

* Coming back to the "being in the kernel means more code reuse" issue,
  this is not insignificant. If your message manager crashes, for
  whatever reason, you will typically have lost all the in-transit
  messages. This is a fairly serious issue. Reusing lots of well tested
  code, and having to adhere to a moderately rigorous coding style and
  set of practices helps a lot. It's not enough by itself to justify
  being in the kernel, but it should not be ignored as a contributory
  factor once one is balancing issues.

* Being in the kernel means that it should be a lot easier to scale to
  multiple processors, along with other forms of scaling that the
  kernel does for you (more or less).

* I've recently received a specific request for support of messaging
  between kernel and userspace (and vice-versa). I've yet to look at
  the feasibility of this (it's my next job after this email), but I
  think it's a fairly simple and non-obscure set of changes to KBUS. I
  don't believe this would be as true of a userspace system. This would
  allow us to replace writing to a user process that exists merely to
  write to a (locally written) driver for a piece of hardware with
  direct communication with that driver.
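
To make the point above about detecting a changed replier more
concrete, here is a toy sketch in Python. It is a model of the idea
only, not KBUS code or the KBUS API: ToyRequestBus, ReplierChanged, the
method names and the message name are all assumptions made up for the
illustration.

```python
class ReplierChanged(Exception):
    """Raised when the bound replier is not the one the sender expected."""


class ToyRequestBus:
    """Toy model of a bus that knows who is bound as replier (hypothetical)."""

    def __init__(self):
        self._next_id = 1
        self._repliers = {}  # message name -> id of the current replier

    def bind_replier(self, name):
        """Bind a new replier to a name and return its bus-assigned id."""
        replier_id = self._next_id
        self._next_id += 1
        self._repliers[name] = replier_id
        return replier_id

    def unbind_replier(self, name):
        """Forget the replier for a name, as if its owner had gone away."""
        self._repliers.pop(name, None)

    def send_request(self, name, expected_replier=None):
        """Check the replier at send time; return the id that will reply."""
        current = self._repliers.get(name)
        if current is None:
            raise LookupError(f"no replier bound to {name!r}")
        if expected_replier is not None and current != expected_replier:
            raise ReplierChanged(f"expected {expected_replier}, got {current}")
        return current


bus = ToyRequestBus()
original = bus.bind_replier("player.position")
accepted_by = bus.send_request("player.position", expected_replier=original)

# The original replier goes away and a new one binds to the same name.
bus.unbind_replier("player.position")
replacement = bus.bind_replier("player.position")

try:
    bus.send_request("player.position", expected_replier=original)
    change_detected = False
except ReplierChanged:
    change_detected = True
```

Because the bus itself assigns and tracks the ids, the check happens
when the request is sent, so the sender learns straight away that any
state held by the old replier is gone.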

Tibs

--
To unsubscribe from this list: send the line "unsubscribe linux-embedded" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html