Re: kvm_read_guest_page() missing kvm->srcu read lock?

Hi,

On 11/05/18 12:43, Paolo Bonzini wrote:
> On 11/05/2018 13:02, Andre Przywara wrote:
>> Hi Paolo,
>>
>> thanks for the answer!
>> Took me a bit, but I think you are right (see below).
>>
>> On 10/05/18 18:43, Paolo Bonzini wrote:
>>> On 10/05/2018 19:41, Andre Przywara wrote:
>>>> Hi,
>>>>
>>>> Jan posted a lockdep splat complaining about a suspicious
>>>> rcu_dereference_check:
>>>> https://lists.cs.columbia.edu/pipermail/kvmarm/2018-May/031116.html
>>>>
>>>> The gist of that is:
>>>> ...
>>>> [ 1025.695517]  dump_stack+0x9c/0xd4
>>>> [ 1025.695524]  lockdep_rcu_suspicious+0xcc/0x118
>>>> [ 1025.695537]  gfn_to_memslot+0x174/0x190
>>>> [ 1025.695546]  kvm_read_guest+0x50/0xb0
>>>> [ 1025.695553]  vgic_its_check_id.isra.0+0x114/0x148
>>>> ...
>>>> I chased that down and wonder if kvm_read_guest{,_page} is supposed to
>>>> be called inside a kvm->srcu critical section?
>>>>
>>>> We have a check that suggests that eventually someone needs to enter the
>>>> SRCU critical section:
>>>> static inline struct kvm_memslots *__kvm_memslots(struct kvm *kvm,
>>>> 						  int as_id)
>>>> {
>>>>         return srcu_dereference_check(kvm->memslots[as_id], &kvm->srcu,
>>>>                         lockdep_is_held(&kvm->slots_lock) ||
>>>>                         !refcount_read(&kvm->users_count));
>>>> }
>>>>
>>>> If I get this correctly, this means that for accessing kvm->memslots we either
>>>> need to be inside an srcu critical section or hold the kvm->slots_lock
>>>> (for updates only).
>>>>
>>>> If I am not mistaken, it is not necessary for *callers* of
>>>> kvm_read_guest_page() to do this, as this could be entirely contained
>>>> inside this function - since we only use the reference to the memslot
>>>> entry while doing the copy_from_user(), and the data is safe afterwards
>>>> from an RCU point of view because it has been *copied*.
>>>
>>> Yes, it's the caller's responsibility.  srcu_read_lock/unlock is pretty
>>> expensive
>>
>> Is that so? I was under the impression that declaring RCU critical
>> sections is very cheap; is that different with SRCU?
> 
> Yes, because RCU effectively lets the scheduler do the expensive parts.
> With SRCU you have to do them yourself with the advantage that: 1) you
> can sleep during RCU critical sections; 2) synchronize_srcu is much
> cheaper than synchronize_rcu and synchronize_sched.
> 
> It is still relatively cheap, and it doesn't serialize against writers,
> but the order of magnitude is 100 clock cycles for each of lock and
> unlock.  Compared with rcu_read_lock/unlock, which are nops on any
> kernel but PREEMPT_RT, that counts as expensive. :)
> 
>>> so KVM assumes that the topmost callers do it.
>>
>> OK, fair enough. And with some hints from Jörg I understand now that x86
>> and s390 do a "srcu_read_lock(&kvm->srcu);" right after leaving the
>> guest and unlock it only shortly before entering again, so that any
>> intermediate calls are protected. That leaves the locking duty only up
>> to calls originating from userspace.
>> But AFAICT none of mips, powerpc or arm/arm64 is doing this. I am
>> checking now whether this is an omission or whether they are really
>> doing fine-grained locking for all memslot accesses.
> 
> Ok, let me Cc the maintainers.  I suppose at least some of them do use
> lockdep from time to time, but it is certainly possible that some cases
> have been missed.

Thanks for that. As far as I can see, mips seems to be safe, because it
doesn't use kvm_read_guest() and its other memslot references seem to
be properly protected. So James can enjoy this weekend ;-)

For powerpc it's a bit more complex. I tried to chase down the four
users of kvm_read_guest() that I could find:
- The one in arch/powerpc/kvm/book3s_rtas.c is safe.
- The two users in arch/powerpc/kvm/book3s_64_mmu_radix.c don't seem to
take any locks, but are only called when creating the VM, so that's
supposedly somewhat safe (?)
- I couldn't find any protection for the usage in
arch/powerpc/kvm/powerpc.c, but the call chain is quite convoluted
there, so I might have missed something. It would be good if someone
more familiar with this code would take a look.

> Adding the srcu_read_lock/unlock directly in kvm_arch_vcpu_ioctl_run and
> any other ioctls that need it is best, but in any case adding more pairs
> is safe because they can be nested.
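
For reference, I understand the per-ioctl version to look roughly like
this (just a sketch of the SRCU pattern, not the actual x86 code; the
function names below are made up):

static long example_vcpu_ioctl(struct kvm_vcpu *vcpu)
{
        int idx;
        long ret;

        /* Enter the SRCU read-side section for the duration of the ioctl. */
        idx = srcu_read_lock(&vcpu->kvm->srcu);

        /*
         * Anything called from here may dereference kvm->memslots, e.g.
         * via kvm_read_guest(). (example_handle_ioctl() is just a
         * placeholder for the actual work.)
         */
        ret = example_handle_ioctl(vcpu);

        /* The index returned by srcu_read_lock() pairs with the unlock. */
        srcu_read_unlock(&vcpu->kvm->srcu, idx);

        return ret;
}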

So I added a small wrapper around kvm_read_guest(), which takes and
drops the lock. Will send out the patch shortly. If powerpc needs it, I
am happy to provide this wrapper in kvm_main.c instead of some arm
header file.
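
Roughly, the wrapper looks like this (a sketch; the final name is of
course up for discussion):

static inline int kvm_read_guest_lock(struct kvm *kvm, gpa_t gpa,
                                      void *data, unsigned long len)
{
        /* Provisional name; takes kvm->srcu just around the guest access. */
        int srcu_idx = srcu_read_lock(&kvm->srcu);
        int ret = kvm_read_guest(kvm, gpa, data, len);

        srcu_read_unlock(&kvm->srcu, srcu_idx);

        return ret;
}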

Cheers,
Andre.


