Rusty Russell wrote:
> On Wednesday 01 April 2009 05:12:47 Gregory Haskins wrote:
>
>> Bare metal:  tput = 4078Mb/s, round-trip = 25593pps (39us rtt)
>> Virtio-net:  tput = 4003Mb/s, round-trip = 320pps (3125us rtt)
>> Venet:       tput = 4050Mb/s, round-trip = 15255pps (65us rtt)
>
> That rtt time is awful.  I know the notification suppression heuristic
> in qemu sucks.
>
> I could dig through the code, but I'll ask directly: what heuristic do
> you use for notification prevention in your venet_tap driver?

I am not 100% sure I know what you mean by "notification prevention", but let me take a stab at it.

Like most of these kinds of constructs, I have two rings (rx + tx on the guest is reversed to tx + rx on the host), each of which can signal in either direction, for a total of 4 events, 2 on each side of the connection.  I utilize what I call "bidirectional napi" so that only the first packet submitted needs to signal across the guest/host boundary.  E.g. the first ingress packet injects an interrupt, then does a napi_schedule and masks future irqs.  Likewise, the first egress packet does a hypercall, then does a "napi_schedule" (I don't actually use napi in this path, but it's conceptually identical) and masks future hypercalls.

So that is my first form of what I would call notification prevention.

The second form occurs on the "tx-complete" path (that is, guest->host tx).  I only signal back to the guest to reclaim its skbs every 10 packets, or when I drain the queue, whichever comes first (note to self: make this # configurable).

The nice part about this scheme is that it significantly reduces the number of guest/host transitions while still providing the lowest possible latency for single packets.  E.g. send one packet, and you get one hypercall and one tx-complete interrupt as soon as it queues on the hardware.  Send 100 packets, and you get one hypercall and 10 tx-complete interrupts, as frequently as every tenth packet queues on the hardware.  There is no timer governing the flow, etc.

Is that what you were asking?  (I've put a rough sketch of both mechanisms in a P.S. below.)

> As you point out, 350-450 is possible, which is still bad, and it's at least
> partially caused by the exit to userspace and two system calls.  If virtio_net
> had a backend in the kernel, we'd be able to compare numbers properly.

:) But that is the whole point, isn't it?  I created vbus specifically as a framework for putting things in the kernel, and that *is* one of the major reasons it is faster than virtio-net... it's not the difference in, say, IOQs vs virtio-ring (though note I also think some of the innovations we have added, such as bidirectional napi, are helping too, but those are not "in-kernel"-specific features and could probably help the userspace version as well).

I would be entirely happy if you guys accepted the general concept and framework of vbus, and then worked with me to convert what I have as "venet-tap" into essentially an in-kernel virtio-net.  I am not specifically interested in creating a competing pv-net driver... I just needed something to showcase the concepts, and I didn't want to hack the virtio-net infrastructure to do it until I had everyone's blessing.

Note to maintainers: I *am* perfectly willing to maintain the venet drivers if, for some reason, we decide that we want to keep them as is.  It's just an ideal of mine to collapse virtio-net and venet-tap together, and I suspect our community would prefer this as well.

-Greg
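
P.S. To make the "only the first packet signals" idea concrete, here is a
minimal user-space model of the suppression logic.  The names (struct ring,
ring_produce, ring_poll) are made up for illustration; this is just the shape
of the scheme, not the actual venet-tap code.

/*
 * Toy model of the "bidirectional napi" suppression described above.
 * The producer signals only when notifications are unmasked; the
 * consumer re-enables them once it has drained the ring.
 */
#include <stdbool.h>
#include <stdio.h>

struct ring {
	int pending;            /* packets queued but not yet consumed      */
	bool notifications_on;  /* may the producer signal the consumer?    */
	int signals_sent;       /* interrupts/hypercalls actually raised    */
};

/* Producer side: queue a packet, signalling only for the first one. */
static void ring_produce(struct ring *r)
{
	r->pending++;
	if (r->notifications_on) {
		r->signals_sent++;           /* inject irq / issue hypercall */
		r->notifications_on = false; /* mask until consumer drains   */
	}
}

/* Consumer side: napi_schedule()-style polling, then unmask. */
static void ring_poll(struct ring *r)
{
	while (r->pending)
		r->pending--;            /* process one packet               */
	r->notifications_on = true;      /* next packet will signal again    */
}

int main(void)
{
	struct ring rx = { .notifications_on = true };
	int i;

	/* A burst of 100 ingress packets costs exactly one notification. */
	for (i = 0; i < 100; i++)
		ring_produce(&rx);
	printf("signals for 100-packet burst: %d\n", rx.signals_sent);

	ring_poll(&rx);

	/* The very next lone packet still gets an immediate signal. */
	ring_produce(&rx);
	printf("signals after one more packet: %d\n", rx.signals_sent);
	return 0;
}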
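
And the tx-complete side, with the same caveat that the function names are
hypothetical; this only models the "every 10 packets or on queue drain"
batching arithmetic described above.  Compiled and run, the counts line up
with the one-hypercall / ten-interrupt numbers I quoted.

/*
 * Toy model of tx-complete batching: the host reclaims guest skbs but
 * only interrupts the guest every TX_COMPLETE_BATCH packets, or when
 * the queue drains, whichever comes first.
 */
#include <stdio.h>

#define TX_COMPLETE_BATCH 10    /* the "every 10 packets" from above */

/* Returns the number of tx-complete interrupts raised for 'queued' packets. */
static int drain_tx_queue(int queued)
{
	int interrupts = 0, since_signal = 0;

	while (queued--) {
		since_signal++;         /* one skb handed to the hardware */
		if (since_signal == TX_COMPLETE_BATCH || queued == 0) {
			interrupts++;   /* signal guest to reclaim skbs   */
			since_signal = 0;
		}
	}
	return interrupts;
}

int main(void)
{
	printf("1 packet    -> %d tx-complete interrupt(s)\n", drain_tx_queue(1));
	printf("100 packets -> %d tx-complete interrupt(s)\n", drain_tx_queue(100));
	return 0;
}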