[4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Tejun,


commit d59cfc09c32a2ae31f1c3bc2983a0cd79afb3f14 (sched, cgroup: replace 
signal_struct->group_rwsem with a global percpu_rwsem) causes some noticably
hickups when starting several kvm guests (which libvirt will move into cgroups
- each vcpu thread and each i/o thread)
When you now start lots of guests in parallel on a bigger system (32CPUs with
2way smt in my case) the system is so busy that systemd runs into several timeouts
like "Did not receive a reply. Possible causes include: the remote application did
not send a reply, the message bus security policy blocked the reply, the reply
timeout expired, or the network connection was broken."

The problem seems to be that the newly used percpu_rwsem does a
rcu_synchronize_sched_expedited for all write downs/ups.

Hacking the percpu_rw_semaphore to always go the slow path and avoid the
synchronize_sched seems to fix the issue. For some (yet unknown to me) reason
the synchronize_sched and the fast path seems to block writers for incredibly
long times.

To trigger the problem, the guest must be CPU bound, iow idle guests seem to
not trigger the pathological case. Any idea how to improve the situation, 
this looks like a real regression for larger kvm installations.

Christian

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux