On Fri, Jan 14, 2011 at 08:54:15AM +0200, Michael S. Tsirkin wrote:
> On Fri, Jan 14, 2011 at 03:35:28PM +0900, Simon Horman wrote:
> > On Fri, Jan 14, 2011 at 06:58:18AM +0200, Michael S. Tsirkin wrote:
> > > On Fri, Jan 14, 2011 at 08:41:36AM +0900, Simon Horman wrote:
> > > > On Thu, Jan 13, 2011 at 10:45:38AM -0500, Jesse Gross wrote:
> > > > > On Thu, Jan 13, 2011 at 1:47 AM, Simon Horman <horms@xxxxxxxxxxxx> wrote:
> > > > > > On Mon, Jan 10, 2011 at 06:31:55PM +0900, Simon Horman wrote:
> > > > > >> On Fri, Jan 07, 2011 at 10:23:58AM +0900, Simon Horman wrote:
> > > > > >> > On Thu, Jan 06, 2011 at 05:38:01PM -0500, Jesse Gross wrote:
> > > > > >> >
> > > > > >> > [ snip ]
> > > > > >> > >
> > > > > >> > > I know that everyone likes a nice netperf result but I agree with
> > > > > >> > > Michael that this probably isn't the right question to be asking.  I
> > > > > >> > > don't think that socket buffers are a real solution to the flow
> > > > > >> > > control problem: they happen to provide that functionality but it's
> > > > > >> > > more of a side effect than anything.  It's just that the amount of
> > > > > >> > > memory consumed by packets in the queue(s) doesn't really have any
> > > > > >> > > implicit meaning for flow control (think multiple physical adapters,
> > > > > >> > > all with the same speed instead of a virtual device and a physical
> > > > > >> > > device with wildly different speeds).  The analog in the physical
> > > > > >> > > world that you're looking for would be Ethernet flow control.
> > > > > >> > > Obviously, if the question is limiting CPU or memory consumption then
> > > > > >> > > that's a different story.
> > > > > >> >
> > > > > >> > Point taken. I will see if I can control CPU (and thus memory) consumption
> > > > > >> > using cgroups and/or tc.
> > > > > >>
> > > > > >> I have found that I can successfully control the throughput using
> > > > > >> the following techniques
> > > > > >>
> > > > > >> 1) Place a tc egress filter on dummy0
> > > > > >>
> > > > > >> 2) Use ovs-ofctl to add a flow that sends skbs to dummy0 and then eth1,
> > > > > >>    this is effectively the same as one of my hacks to the datapath
> > > > > >>    that I mentioned in an earlier mail. The result is that eth1
> > > > > >>    "paces" the connection.
> > >
> > > This is actually a bug. This means that one slow connection will affect
> > > fast ones. I intend to change the default for qemu to sndbuf=0 : this
> > > will fix it but break your "pacing". So pls do not count on this
> > > behaviour.
> >
> > Do you have a patch I could test?
>
> You can (and users already can) just run qemu with sndbuf=0. But if you
> like, below.

Thanks

> > > > > > Further to this, I wonder if there is any interest in providing
> > > > > > a method to switch the action order - using ovs-ofctl is a hack imho -
> > > > > > and/or switching the default action order for mirroring.
> > > > >
> > > > > I'm not sure that there is a way to do this that is correct in the
> > > > > generic case. It's possible that the destination could be a VM while
> > > > > packets are being mirrored to a physical device or we could be
> > > > > multicasting or some other arbitrarily complex scenario. Just think
> > > > > of what a physical switch would do if it has ports with two different
> > > > > speeds.
> > > >
> > > > Yes, I have considered that case. And I agree that perhaps there
> > > > is no sensible default. But perhaps we could make it configurable somehow?
> > >
> > > The fix is at the application level. Run netperf with -b and -w flags to
> > > limit the speed to a sensible value.
> >
> > Perhaps I should have stated my goals more clearly.
> > I'm interested in situations where I don't control the application.
>
> Well an application that streams UDP without any throttling
> at the application level will break on a physical network, right?
> So I am not sure why should one try to make it work on the virtual one.
>
> But let's assume that you do want to throttle the guest
> for reasons such as QOS. The proper approach seems
> to be to throttle the sender, not have a dummy throttled
> receiver "pacing" it. Place the qemu process in the
> correct net_cls cgroup, set the class id and apply a rate limit?

I would like to be able to use a class to rate limit egress packets.
That much works fine for me.

What I would also like is for there to be back-pressure such that the
guest doesn't consume lots of CPU, spinning, sending packets as fast as
it can, almost all of which are dropped. That does seem like a lot of
wasted CPU to me.

Unfortunately there are several problems with this and I am fast
concluding that I will need to use a CPU cgroup. Which does make some
sense, as what I am really trying to limit here is CPU usage, not network
packet rates - even if the test using the CPU is netperf. So long as the
CPU usage can (mostly) be attributed to the guest, using a cgroup should
work fine. And indeed it seems to in my limited testing.

One scenario in which I don't think it is possible for there to be
back-pressure in a meaningful sense is if root in the guest sets
/proc/sys/net/core/wmem_default to a large value, say 2000000.

I do think that to some extent there is back-pressure provided by sockbuf
in the case where a process on the host is sending directly to a physical
interface. And to my mind it would be "nice" if the same kind of
back-pressure was present in guests. But through our discussions of the
past week or so I get the feeling that is not your view of things.
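
To make the wasted-CPU argument concrete, here is a toy model of the two
cases. It is only a sketch and was not part of the measurements discussed
in this thread: the function name simulate() and all of its parameters are
made up for illustration, and it models neither qemu, virtio, Open vSwitch
nor tc. It assumes a sender that tries to queue a fixed burst of packets
per tick into a bounded queue that drains at a fixed rate, and counts how
many send attempts the sender makes with and without back-pressure.

```python
def simulate(ticks, drain_per_tick, queue_limit, burst_per_tick, backpressure):
    """Toy model of a sender feeding a rate-limited queue (hypothetical helper).

    Each tick the sender tries to enqueue up to burst_per_tick packets,
    then the queue drains up to drain_per_tick packets.
    With backpressure=True the sender stops for the rest of the tick as
    soon as the queue is full (as if it had blocked). With
    backpressure=False it keeps trying and every extra packet is dropped.
    """
    queued = delivered = dropped = attempts = 0
    for _ in range(ticks):
        for _ in range(burst_per_tick):
            if queued < queue_limit:
                queued += 1
                attempts += 1
            elif backpressure:
                break  # sender sleeps until the next tick, no further work
            else:
                attempts += 1  # work was done to send the packet ...
                dropped += 1   # ... but the packet is thrown away
        drained = min(queued, drain_per_tick)
        queued -= drained
        delivered += drained
    return {"attempts": attempts, "delivered": delivered, "dropped": dropped}


if __name__ == "__main__":
    for bp in (False, True):
        result = simulate(ticks=1000, drain_per_tick=10, queue_limit=20,
                          burst_per_tick=100, backpressure=bp)
        print("backpressure=%s: %s" % (bp, result))
```

With these (arbitrary) parameters both runs deliver the same number of
packets, since the drain rate is the bottleneck either way, but the run
without back-pressure makes roughly ten times as many send attempts. In
this model the attempt count stands in for the CPU that the guest burns.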

Perhaps I could characterise the guest situation by saying:

Egress packet rates can be controlled using tc on the host;
Guest CPU usage can be controlled using CPU cgroups on the host;
Sockbuf controls memory usage on the host;
Back-pressure is irrelevant.