On Tue, May 28, 2013 at 11:02:09PM +0800, Xiao Guangrong wrote:
> On 05/28/2013 08:18 AM, Marcelo Tosatti wrote:
> > On Mon, May 27, 2013 at 10:20:12AM +0800, Xiao Guangrong wrote:
> >> On 05/25/2013 04:34 AM, Marcelo Tosatti wrote:
> >>> On Thu, May 23, 2013 at 03:55:53AM +0800, Xiao Guangrong wrote:
> >>>> Zap at least 10 pages before releasing mmu-lock to reduce the overhead
> >>>> caused by re-acquiring the lock.
> >>>>
> >>>> After the patch, kvm_zap_obsolete_pages can make forward progress anyway,
> >>>> so update the comments.
> >>>>
> >>>> [ It improves kernel building 0.6% ~ 1% ]
> >>>
> >>> Can you please describe the overhead in more detail? Under what scenario
> >>> is kernel building improved?
> >>
> >> Yes.
> >>
> >> The scenario is: we do a kernel build and, meanwhile, repeatedly read the
> >> PCI ROM once per second.
> >>
> >> [
> >> echo 1 > /sys/bus/pci/devices/0000\:00\:03.0/rom
> >> cat /sys/bus/pci/devices/0000\:00\:03.0/rom > /dev/null
> >> ]
> >
> > I can't see why it reflects a real world scenario (or a real world
> > scenario with the same characteristics regarding kvm_mmu_zap_all vs faults)?
> >
> > Point is, it would be good to understand why this change improves
> > performance. What are these cases where we break out of
> > kvm_mmu_zap_all due to either (need_resched || spin_needbreak) with zapped
> > < 10?
>
> When the guest reads the ROM, qemu sets up the memory to map the device's
> firmware; that is why kvm_mmu_zap_all can be called in this scenario.
>
> The reasons why it hurts performance are:
> 1) Qemu uses a global io-lock to sync all vcpus, so the io-lock is held
>    when we do kvm_mmu_zap_all(). If kvm_mmu_zap_all() is not efficient, all
>    other vcpus have to wait a long time to do I/O.
>
> 2) kvm_mmu_zap_all() is triggered in vcpu context, so it can block IPI
>    requests from other vcpus.
>
> Is that enough?

That is no problem. The problem is why you chose "10" as the minimum number
of pages to zap before considering a reschedule. I would expect the need to
reschedule to be rare enough that one kvm_mmu_zap_all instance (between
schedule in and schedule out) should be able to release no less than a
thousand pages.

So I'd like to understand better what the motivation for this change is
(this was the original question).
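
For reference, the batching scheme under discussion amounts to something like
the sketch below. This is illustrative only, reconstructed from the patch
description in this thread; the helpers kvm_mmu_prepare_zap_page,
kvm_mmu_commit_zap_page and is_obsolete_sp follow existing KVM MMU code, but
the body is an assumption, not the actual patch.

/*
 * Sketch (not the actual patch): zap at least BATCH_ZAP_PAGES obsolete
 * pages per mmu_lock hold before honoring need_resched()/spin_needbreak(),
 * so each lock acquisition makes real forward progress instead of bouncing
 * on contention.
 */
#define BATCH_ZAP_PAGES	10

static void zap_obsolete_pages_batched(struct kvm *kvm)
{
	struct kvm_mmu_page *sp, *node;
	LIST_HEAD(invalid_list);
	int batch = 0;

restart:
	list_for_each_entry_safe_reverse(sp, node,
				&kvm->arch.active_mmu_pages, link) {
		int ret;

		/* Obsolete pages sit at the tail; stop at the first valid one. */
		if (!is_obsolete_sp(kvm, sp))
			break;

		/*
		 * Only consider dropping mmu_lock once the minimum batch has
		 * been zapped: commit the pending zaps (TLB flush + free),
		 * give other lock waiters / the scheduler a chance, then
		 * restart the walk since the list may have changed.
		 */
		if (batch >= BATCH_ZAP_PAGES &&
		    (need_resched() || spin_needbreak(&kvm->mmu_lock))) {
			kvm_mmu_commit_zap_page(kvm, &invalid_list);
			cond_resched_lock(&kvm->mmu_lock);
			batch = 0;
			goto restart;
		}

		ret = kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
		batch += ret;
		if (ret)
			goto restart;
	}

	kvm_mmu_commit_zap_page(kvm, &invalid_list);
}

The batch threshold is what keeps (need_resched || spin_needbreak) from
breaking out of the loop after only a handful of zapped pages while other
vcpus are hammering mmu_lock; Marcelo's question above is about why that
threshold needs to be as low as 10.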