On 1/25/25 00:44, Sean Christopherson wrote:
I did try a long SRCU critical section and it was unreviewable. It
ends up a lot less manageable than just making the lock a leaf,
especially as the lock hierarchy spans multiple subsystems (static
key, KVM, cpufreq---thanks CPU hotplug lock...).
I'm not following. If __kvmclock_cpufreq_notifier() and set_nx_huge_pages()
switch to SRCU, then the deadlock goes away (it might even go away if just one
of those two switches). [...] It would be single use in the it only
protects pure reader of vm_list, but there are plenty of those users.
Yes, single use in the sense that only set_nx_huge_pages() really needs
it. kvm_lock doesn't produce any noticeable contention and as you noted
sometimes you really need it even in pure readers.
SRCU readers would only interact with kvm_destroy_vm() from a locking perspective,
and if that's problematic then we would already have a plethora of issues.
Ah yeah, I missed that you cannot hold any lock when calling
kvm_put_kvm(). So the waiting side is indeed a leaf and cannot block
someone else.
Still from your patch (thanks!) I don't really like the special cases on
taking SRCU vs. kvm_lock... It really seems like a job for a mutex or
rwsem. It keeps the complexity in the one place that is different (i.e.
where a lock is taken inside the iteration) and everything else can just
iterate normally.
Paolo