On 2015-04-23 12:40, Paolo Bonzini wrote:
>
>
> On 23/04/2015 23:13, Liang Li wrote:
>> Remove lazy FPU logic and use eager FPU entirely. Eager FPU does
>> not have performance regression, and it can simplify the code.
>>
>> When compiling kernel on westmere, the performance of eager FPU
>> is about 0.4% faster than lazy FPU.
>>
>> Signed-off-by: Liang Li <liang.z.li@xxxxxxxxx>
>> Signed-off-by: Xudong Hao <xudong.hao@xxxxxxxxx>
>
> A patch like this requires much more benchmarking than what you have done.
>
> First, what guest did you use? A modern Linux guest will hardly ever exit
> to userspace: the scheduler uses the TSC deadline timer, which is handled
> in the kernel; the clocksource uses the TSC; virtio-blk devices are kicked
> via ioeventfd.
>
> What happens if you time a Windows guest (without any Hyper-V enlightenments),
> or if you use clocksource=acpi_pm?
>
> Second, "0.4%" by itself may not be statistically significant. How did
> you gather the result? How many times did you run the benchmark? Did
> the guest report any stolen time?
>
> And finally, even if the patch was indeed a performance improvement,
> there is much more that you can remove. fpu_active is always 1,
> vmx_fpu_activate only has one call site that can be simplified just to
>
>     vcpu->arch.cr0_guest_owned_bits = X86_CR0_TS;
>     vmcs_writel(CR0_GUEST_HOST_MASK, ~vcpu->arch.cr0_guest_owned_bits);
>
> and so on.

And it would be good to know what the benchmarks look like on CPUs other
than the chosen Intel model, including older ones.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux