Hi

On Thu, Feb 5, 2015 at 12:03 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> I see "latencies" of around 20 microseconds with lockdep and context
> tracking off. For example:

With neither metadata nor memfd transmission, I get 2.5us for kdbus and
1.5us for UDS (8k payload). With 8-byte payloads, I get 2.2us and 1.2us.
I suspect you enabled metadata transmission, which I think is not a fair
comparison.

A few notes on that:

 * kdbus is a bus layer. We don't intend to replace UDS, but improve
   dbus. Comparing roundtrip times with UDS is tempting, but in no way
   fair. At the very least, a bus layer has to perform peer-lookup,
   which UDS does not have to do. Imo, 2.5us vs. 1.5us is already
   pretty nice. Compare this to ~77us for dbus1 without marshaling.

 * We have not optimized kdbus code-paths for speed, yet. Our main
   concerns are algorithmic challenges, and we believe they've been
   improved considerably with kdbus. I have constantly measured kdbus
   performance with 'perf' and flame-graphs, and there are a lot of
   possible optimizations (especially on locking). However, I think
   this can be done afterwards just fine. Neither API nor ioctl
   overhead has shown up in my measurements. If anyone has
   counter-evidence, please let us know. But I'm a bit reluctant to
   change our API solely based on performance guesses.

 * We're about 50% slower than UDS on 1-byte transmissions. With 32k
   we're on par. How can a lightweight user-space daemon even get close
   to that?

 * Broadcast performance is a completely different story. SEND gets
   around 30% faster compared to kdbus unicasts (as most of the
   control-paths are only taken once per message, instead of once per
   destination).

 * test-benchmark.c does performance tests in a single process. If the
   bus-layer is implemented in user-space, you need to account for
   context-switches and task wakeups. My UDS and pipe round-trip
   latency tests got around 3x slower if done across processes (3.7us
   instead of 1.2us). With a user-space daemon, those slow-downs are
   taken two times more often for each roundtrip.

 * Process time is accounted on the sender, instead of a shared process
   (dbus-daemon). Broadcasts will thus no longer consume time-slices of
   dbus-daemon, but only the sender's.

With kdbus, we implement a bus-layer. This is our only target! If your
target environment does not require a bus, then don't use kdbus. We
don't intend to replace UDS. On a bus-layer, we need peer-discovery,
policy-handling, destination-lookups, broadcast-management and more.
Pipes/UDS do not provide any of this. I cannot see how any other
existing bus-implementation comes even close to kdbus, performance-wise.
If someone does, please let us know!

Thanks
David
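
For anyone who wants to get a feel for the single-process versus
cross-process round-trip comparison mentioned above, here is a minimal
sketch in Python. It is not test-benchmark.c and not the code behind
the numbers in this mail; the function names, the default iteration
count and the 8-byte payload are assumptions made for illustration.
Interpreter overhead dominates the absolute values, so only the ratio
between the two results is of interest. It assumes a Unix system with
os.fork().

```python
import os
import socket
import time


def roundtrip_same_process(iterations=10000, payload=b"x" * 8):
    """Mean round-trip time (seconds) with both socket ends in one process."""
    a, b = socket.socketpair()  # AF_UNIX stream pair on Unix systems
    size = len(payload)
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            a.sendall(payload)
            # Echo from the other end without leaving this process.
            b.sendall(b.recv(size, socket.MSG_WAITALL))
            a.recv(size, socket.MSG_WAITALL)
        return (time.perf_counter() - start) / iterations
    finally:
        a.close()
        b.close()


def roundtrip_cross_process(iterations=10000, payload=b"x" * 8):
    """Mean round-trip time (seconds) with the echo side in a forked child."""
    a, b = socket.socketpair()
    size = len(payload)
    pid = os.fork()
    if pid == 0:
        # Child: echo every message back, then exit without cleanup handlers.
        a.close()
        try:
            for _ in range(iterations):
                b.sendall(b.recv(size, socket.MSG_WAITALL))
        finally:
            os._exit(0)
    # Parent: each round trip now needs a wakeup of the child and back.
    b.close()
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            a.sendall(payload)
            a.recv(size, socket.MSG_WAITALL)
        elapsed = time.perf_counter() - start
    finally:
        a.close()
        os.waitpid(pid, 0)
    return elapsed / iterations


if __name__ == "__main__":
    same = roundtrip_same_process()
    cross = roundtrip_cross_process()
    print(f"same process:  {same * 1e6:.2f} us per round trip")
    print(f"cross process: {cross * 1e6:.2f} us per round trip")
```

Keep the payload well below the socket buffer size: in the
single-process variant a send that blocks would never be drained.
Pinning both processes to specific CPUs (or not) changes the
cross-process result noticeably, so treat any ratio as indicative only.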