Re: kvm_read_guest_page() missing kvm->srcu read lock?

On 11/05/2018 13:02, Andre Przywara wrote:
> Hi Paolo,
> 
> thanks for the answer!
> Took me a bit, but I think you are right (see below).
> 
> On 10/05/18 18:43, Paolo Bonzini wrote:
>> On 10/05/2018 19:41, Andre Przywara wrote:
>>> Hi,
>>>
>>> Jan posted a lockdep splat complaining about a suspicious
>>> rcu_dereference_check:
>>> https://lists.cs.columbia.edu/pipermail/kvmarm/2018-May/031116.html
>>>
>>> The gist of that is:
>>> ...
>>> [ 1025.695517]  dump_stack+0x9c/0xd4
>>> [ 1025.695524]  lockdep_rcu_suspicious+0xcc/0x118
>>> [ 1025.695537]  gfn_to_memslot+0x174/0x190
>>> [ 1025.695546]  kvm_read_guest+0x50/0xb0
>>> [ 1025.695553]  vgic_its_check_id.isra.0+0x114/0x148
>>> ...
>>> I chased that down and wonder if kvm_read_guest{,_page} is supposed to
>>> be called inside a kvm->srcu critical section?
>>>
>>> We have a check that suggests that eventually someone needs to enter the
>>> SRCU critical section:
>>> static inline struct kvm_memslots *__kvm_memslots(struct kvm *kvm,
>>> 						  int as_id)
>>> {
>>>         return srcu_dereference_check(kvm->memslots[as_id], &kvm->srcu,
>>>                         lockdep_is_held(&kvm->slots_lock) ||
>>>                         !refcount_read(&kvm->users_count));
>>> }
>>>
>>> If I get this correctly, this means that for accessing kvm->memslots we either
>>> need to be inside an srcu critical section or hold the kvm->slots_lock
>>> (for updates only).
>>>
>>> If I am not mistaken, it is not necessary for *callers* of
>>> kvm_read_guest_page() to do this, as this could be entirely contained
>>> inside this function - since we only use the reference to the memslot
>>> entry while doing the copy_from_user(), and the data is safe afterwards
>>> from an RCU point of view because it has been *copied*.
>>
>> Yes, it's the caller's responsibility.  srcu_read_lock/unlock is pretty
>> expensive
> 
> Is that so? I was under the impression that declaring RCU critical
> sections is very cheap; is that different with SRCU?

Yes, because RCU effectively lets the scheduler do the expensive parts.
With SRCU you have to do them yourself, with the advantages that: 1) you
can sleep inside SRCU read-side critical sections; 2) synchronize_srcu is
much cheaper than synchronize_rcu and synchronize_sched.

It is still relatively cheap, and it doesn't serialize against writers,
but the order of magnitude is 100 clock cycles for each of lock and
unlock.  Compared with rcu_read_lock/unlock, which are nops on any
kernel but PREEMPT_RT, that counts as expensive. :)
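
For reference, at a call site the whole thing is just this (only a
sketch; gpa, buf and len stand in for whatever the caller already has):

	int idx, ret;

	/* enter the SRCU read side so the memslot lookup below is safe */
	idx = srcu_read_lock(&vcpu->kvm->srcu);
	ret = kvm_read_guest(vcpu->kvm, gpa, buf, len);
	srcu_read_unlock(&vcpu->kvm->srcu, idx);

Those two calls are the whole overhead discussed above; the copy itself
is unaffected.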

>> so KVM assumes that the topmost callers do it.
> 
> OK, fair enough. And with some hints from Jörg I understand now that x86
> and s390 do a "srcu_read_lock(&kvm->srcu);" right after leaving the
> guest and unlock it only shortly before entering again, so that any
> intermediate calls are protected. That leaves the locking duty only up
> to calls originating from userspace.
> But AFAICT neither mips, powerpc nor arm/arm64 does this. I am
> checking now whether this is an omission or whether they are really
> doing fine grained locking for all memslots accesses.

Ok, let me Cc the maintainers.  I suppose at least some of them do use
lockdep from time to time, but it is certainly possible that some cases
have been missed.

Adding the srcu_read_lock/unlock directly in kvm_arch_vcpu_ioctl_run and
any other ioctls that need it is best, but in any case adding more pairs
is safe because they can be nested.
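
In kvm_arch_vcpu_ioctl_run that would look roughly like this (just an
untested sketch, not the actual arm/arm64 code; vcpu_run_loop() is only
a placeholder for the architecture's existing run loop):

int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
	int ret, idx;

	/* cover everything below that may look up memslots */
	idx = srcu_read_lock(&vcpu->kvm->srcu);

	/*
	 * vcpu_run_loop() stands for the existing exit handling, MMIO
	 * emulation, vgic table walks, etc.; any gfn_to_memslot()
	 * reached from here is now inside the read-side section.
	 */
	ret = vcpu_run_loop(vcpu, run);

	srcu_read_unlock(&vcpu->kvm->srcu, idx);
	return ret;
}

(x86 additionally drops the lock around the actual guest entry, as you
noted above, but the idea is the same.)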

Thanks for the report!

Paolo


