Hi Mingwei, On Mon, Dec 05, 2022 at 05:51:13AM +0000, Mingwei Zhang wrote: > On Mon, Nov 14, 2022, Oliver Upton wrote: [...] > > As hyp stage-1 is protected by a spinlock there is no actual need for > > RCU in that case. I'll post something later on today that addresses the > > issue. > > > > For each stage-2 page table walk, KVM will use > kvm_mmu_topup_memory_cache() before taking the mmu lock. This ensures > whoever holding the mmu lock won't sleep. hyp walkers seems to > miss this notion completely, whic makes me puzzeled. Using a spinlock > only ensures functionality but seems quite inefficient if the one who > holds the spinlock try to allocate pages and sleep... You're probably confused by my mischaracterization in the above paragraph. Hyp stage-1 walkers (outside of pKVM) are guarded with a mutex and are perfectly able to sleep. The erroneous application of RCU led to this path becoming non-sleepable, hence the bug. pKVM's own hyp stage-1 walkers are guarded by a spinlock, but the memory allocations come from its own allocator and there is no concept of a scheduler at EL2. > Why do we need an RCU lock here. Oh is it for batching? We definitely don't need RCU here, thus the corrective measure was to avoid RCU for exclusive table walks. https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git/commit/?h=next&id=b7833bf202e3068abb77c642a0843f696e9c8d38 -- Thanks, Oliver