On Mon, Jul 26, 2021 at 12:52 AM Dan Carpenter <dan.carpenter@xxxxxxxxxx> wrote:
>
> [ This is not the correct patch to blame, but there is something going
>   on here which I don't understand, so this email is more about me
>   learning than reporting bugs. - dan ]
>
> Hello Ben Gardon,
>
> The patch 531810caa9f4: "KVM: x86/mmu: Use an rwlock for the x86 MMU"
> from Feb 2, 2021, leads to the following static checker warning:
>
>         arch/x86/kvm/mmu/mmu.c:5769 kvm_mmu_zap_all()
>         warn: sleeping in atomic context
>
> arch/x86/kvm/mmu/mmu.c
>   5756  void kvm_mmu_zap_all(struct kvm *kvm)
>   5757  {
>   5758          struct kvm_mmu_page *sp, *node;
>   5759          LIST_HEAD(invalid_list);
>   5760          int ign;
>   5761
>   5762          write_lock(&kvm->mmu_lock);
>                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
>                 This line bumps the preempt count.
>
>   5763  restart:
>   5764          list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) {
>   5765                  if (WARN_ON(sp->role.invalid))
>   5766                          continue;
>   5767                  if (__kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list, &ign))
>   5768                          goto restart;
> --> 5769                if (cond_resched_rwlock_write(&kvm->mmu_lock))
>                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>                 This line triggers a sleeping in atomic warning.  What's going on here
>                 that I'm not understanding?

Hi Dan,

Thanks for sending this. I'm confused by this sequence too; I'm not
sure how this could sleep in an atomic context.

My first thought was that there might be something going on with the
qrwlock's wait_lock, but since this thread has already acquired the
rwlock, it can't be holding or waiting on the wait_lock.

Then I thought the __might_sleep could be in the wrong place, but it's
in the same place for a regular spinlock, so I think that's fine.

I do note that __cond_resched_rwlock does not check rwlock_needbreak
the way __cond_resched_lock checks spin_needbreak. That seems like an
oversight, but I don't see how it could cause this warning.

I'm as confused by this as you are. Did you confirm that this
sleeping-in-atomic warning does not happen before this commit? What
kind of configuration are you able to reproduce it on?

It might be worth asking some sched / locking folks about this, as
they'll likely have a better understanding of all the intricacies of
the layered locking macros. I'm very curious to understand what's
causing this too.

Ben

>
>
>   5770                          goto restart;
>   5771          }
>   5772
>   5773          kvm_mmu_commit_zap_page(kvm, &invalid_list);
>   5774
>   5775          if (is_tdp_mmu_enabled(kvm))
>   5776                  kvm_tdp_mmu_zap_all(kvm);
>   5777
>   5778          write_unlock(&kvm->mmu_lock);
>   5779  }
>
> regards,
> dan carpenter
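
For context on the pattern being discussed above: cond_resched_rwlock_write()
is meant to drop kvm->mmu_lock, give the scheduler and any contending lockers
a chance to run, then reacquire the lock and return non-zero so the caller
knows to restart its walk, which is why kvm_mmu_zap_all() does "goto restart"
after it. Below is a minimal userspace sketch of that drop/yield/reacquire
shape using POSIX rwlocks. It is an illustration only, not kernel code; the
demo_* names and the trivial "needbreak" stand-in are invented for the example.

/*
 * Userspace illustration (not kernel code) of the drop/yield/reacquire
 * pattern that cond_resched_rwlock_write() provides for kvm->mmu_lock.
 * demo_needbreak() is a made-up stand-in for the kernel's "does anyone
 * need the CPU or the lock?" checks.
 */
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

static pthread_rwlock_t demo_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Pretend something else always wants to run, to force a yield. */
static int demo_needbreak(void)
{
        return 1;
}

/* Returns 1 if the write lock was dropped and reacquired. */
static int demo_cond_resched_wrlock(pthread_rwlock_t *lock)
{
        if (!demo_needbreak())
                return 0;

        pthread_rwlock_unlock(lock);    /* drop the write lock          */
        sched_yield();                  /* let other threads run        */
        pthread_rwlock_wrlock(lock);    /* take the write lock again    */
        return 1;                       /* caller must restart its walk */
}

int main(void)
{
        pthread_rwlock_wrlock(&demo_lock);

        /* A long scan would go here, offering to yield periodically. */
        if (demo_cond_resched_wrlock(&demo_lock))
                printf("lock was dropped; restart the walk\n");

        pthread_rwlock_unlock(&demo_lock);
        return 0;
}

Build with: cc -pthread demo.c. In this simplified shape the yield only ever
happens while the write lock is not held, which is why the warning Dan
reports is surprising, as Ben notes above.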