Anthony Liguori wrote:
> Gregory Haskins wrote:
>> Anthony Liguori wrote:
>> I think there is a slight disconnect here.  This is *exactly* what I am
>> trying to do.
>
> If it were exactly what you were trying to do, you would have posted a
> virtio-net in-kernel backend implementation instead of a whole new
> paravirtual IO framework ;-)

Semantics, semantics ;)  But ok, fair enough.

>>> That said, I don't think we're bound today by the fact that we're in
>>> userspace.
>>
>> You will *always* be bound by the fact that you are in userspace.
>
> Again, let's talk numbers.  A heavy-weight exit is 1us slower than a
> light-weight exit.  Ideally, you're taking < 1 exit per packet because
> you're batching notifications.  If your ping latency on bare metal
> compared to vbus is 39us to 65us, then all other things being equal,
> the cost imposed by doing what you're doing in userspace would make the
> latency 66us, taking you from 166% of native to 169% of native.  That's
> not a huge difference, and I'm sure you'll agree there are a lot of
> opportunities to improve that even further.

Ok, so let's see it happen.  Consider the gauntlet thrown :)  Your
challenge, should you choose to accept it, is to take today's 4000us and
hit a 65us latency target while maintaining 10GE line-rate (at least
1500-MTU line-rate).  I personally don't want to stop at 65us; I want to
hit 36us!  In case you think that is crazy: my first prototype of venet
was hitting about 140us, and I shaved 10us here, 10us there, eventually
getting down to the 65us we have today.  The low-hanging fruit is all
but harvested at this point, but I am not done searching for additional
sources of latency.  I just needed to take a breather to get the code
out there for review. :)

> And you didn't mention whether your latency tests are based on ping or
> something more sophisticated

Well, the numbers posted were actually from "netperf -t UDP_RR".
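Incidentally, the percentage math quoted above does check out.  A
trivial sketch, using the 39us (native) and 65us (vbus) ping figures and
the ~1us heavy-weight-exit penalty from the quoted mail (illustrative
numbers only, not new measurements):

```python
# Sanity-check the quoted exit-cost arithmetic: adding one heavy-weight
# exit (~1us) to the 65us vbus RTT gives 66us, moving from ~166% to
# ~169% of the 39us native baseline.

native_rtt_us = 39.0          # bare-metal ping latency (from the mail)
vbus_rtt_us = 65.0            # current vbus/venet ping latency
heavy_exit_penalty_us = 1.0   # heavy-weight vs light-weight exit cost

userspace_rtt_us = vbus_rtt_us + heavy_exit_penalty_us  # 66us

vbus_pct = int(100 * vbus_rtt_us / native_rtt_us)            # -> 166
userspace_pct = int(100 * userspace_rtt_us / native_rtt_us)  # -> 169

print("vbus: %d%% of native, via userspace: %d%%" % (vbus_pct, userspace_pct))
```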
This generates a pps figure from a continuous (but non-bursted) RTT
measurement, so I invert the pps result of this test to get the average
RTT.  I have also confirmed that ping jives with these results (e.g.
virtio-net results were about 4ms, and venet about 0.065ms, as reported
by ping).

> as ping will be a pathological case

Ah, but this is not really pathological, IMO.  There are plenty of
workloads that exhibit request-reply patterns (e.g. RPC), and this is a
direct measurement of the system's ability to support them efficiently.
Even unidirectional flows can be hampered by poor latency (think PTP
clock sync, etc.).  Massive throughput with poor latency is like Andrew
Tanenbaum's station wagon full of backup tapes ;)  I think I have proven
we can actually get both with a little creative use of resources.

> that doesn't allow any notification batching.

Well, if we can take anything away from all this, it is that I have
demonstrated you don't need notification batching to get good
throughput.  And batching on the head-end of the queue adds directly to
your latency overhead, so I don't think it's a good technique in general
(though I realize that not everyone cares about latency, per se, so
maybe most are satisfied with the status quo).

>
>> I agree that the "does anyone care" part of the equation will approach
>> zero as the latency difference shrinks across some threshold (probably
>> the single-microsecond range), but I will believe that is even possible
>> when I see it ;)
>
> Note the other hat we have to wear is not just virtualization
> developer but Linux developer.  If there are bad userspace interfaces
> for IO that impose artificial restrictions, then we need to identify
> those and fix them.

Fair enough, and I would love to take that on, but alas my
development/debug bandwidth is rather finite these days ;)

-Greg
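P.S. To make the pps-to-RTT inversion concrete: UDP_RR reports
transactions per second, and with a single transaction outstanding each
transaction is one complete request/reply round trip, so the average RTT
is simply 1/tps.  A trivial sketch (the tps values below are made up to
match the latencies discussed, not actual netperf output):

```python
# Convert a netperf UDP_RR transaction rate into an average RTT.
# With one transaction in flight at a time, each transaction is one
# full request/reply round trip, so rtt = 1 / tps.

def rtt_us_from_tps(tps):
    """Average round-trip time in microseconds for a given trans/sec."""
    return 1_000_000.0 / tps

# Illustrative rates only (chosen to match the latencies quoted above):
venet_tps = 15384.0   # inverts to ~65us RTT
virtio_tps = 250.0    # inverts to 4000us (4ms) RTT

print(rtt_us_from_tps(venet_tps))
print(rtt_us_from_tps(virtio_tps))
```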