On Sun, Mar 04, 2012, Michael S. Tsirkin wrote about "Re: [PATCH] vhost: don't forget to schedule()": > On Sun, Mar 04, 2012 at 11:10:01AM +0200, Nadav Har'El wrote: > > On Tue, Feb 28, 2012, Avi Kivity wrote about "Re: [PATCH] vhost: don't forget to schedule()": > > > > > > + if (need_resched()) > > > > > > + schedule(); > > > > > This is cond_resched(), no? >... > > Hi. This discussion is already getting several orders of magnitude longer > > than the patch :-) >... > > The argument is mostly about what's faster, right? > You can push the argument along if you run some benchmarks to show the > performance impact of the proposed variants. Hi, I did some tests using Netperf TCP-STREAM, message size 1K, over a 10Gbps network. Each of the numbers below are an average of several runs. Test 1: "First, Do no harm" - verify that with just one guest, when these extra reschedule attempts can't help, they at least don't hurt. * Unmodified vhost-net: 3135 Mbps. * My patch, with if(need_sched())schedule(): 3088 Mbps. The 1.5% average performance decrease is very small, and close the to measurement variation (1.4%) so I'd say it's negligable. * Avi's proposed alternative, with cond_sched() had, suprisingly, a very high variation in the results. Two measurments yielded an average of 3038 Mbps (close, but slightly lower than above), but 6 other measurements yielded significantly lower numbers, between 2300 and 2875. So it seems my patch does not hurt performance, so we don't need to find a "better" version for performance's sake. The cond_resched() version was worse - had a slightly worse maximum performance, and a high variance, so I don't recommend it. That being said, I don't have a good explanation on why cond_resched() is bad. Test 2: Demonstrate that the patch is actually important. Here we run two or three guests doing the same Netperf to a different server, with all 2 or 3 vhost threads' affinity set to the same core. * Behaviour of the unmodified vhost-net was terrible, and demonstrates what this bug fixes: - For two guests, one yielded a throughput of 3172 Mbps, but the second yielded only 0.5 Mbps (!!), because its vhost thread hardly ran at all. For three guests, again, one guest yielded 3202 Mbps, but the second and third each yielded less than 1Mbps. - "top" and similar utilities did not show the vhost threads at all, because CPU time is not accounted if you never schedule(). * Performance with our patch (if(need_sched())schedule()), fixes both problems: - For two guests, one yields 1557 Mbps, the second 1531 Mbps. Fair, and efficient (1557+1531=3088), as expected. For three guests, the number sare 1032+1028+1045=3105. - "top" shows each vhost thread to take close to the expect half or third of the core running both. * With the alternative cond_resched(), again the problems are fixed, but performance is less consistent, varying by almost 20% from run to run. Again, I can't explaine why. All it all, I think this proves that the patch I sent, as is, is important, and does no harm, and definitely not worse (and perhaps better) than the proposed alternative implementation with cond_resched. So please apply. Thanks, Nadav. -- Nadav Har'El | Sunday, Mar 18 2012, nyh@xxxxxxxxxxxxxxxxxxx |----------------------------------------- Phone +972-523-790466, ICQ 13349191 |If glory comes after death, I'm not in a http://nadav.harel.org.il |hurry. (Latin proverb) -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html