Hi Chase,

Yes, it took that long for me to get back to the NV series. Sorry
about that.

On Wed, 14 Jul 2021 17:40:03 +0100,
Chase Conklin <chase.conklin@xxxxxxx> wrote:
> I'm noticing a hang while an L2 is booting. From what I can tell, the
> L0 is issuing TLBIs to the wrong VMID, so the L2 is getting stuck
> taking the same abort repeatedly.
>
> It seems that kvm_unmap_stage2_range doesn't perform the invalidations
> using the mmu passed to it here. Instead, it uses the passed mmu to
> get back the kvm before passing that to stage2_apply_range which gets
> its mmu from kvm->arch.mmu. This has the effect of applying
> invalidations intended for the nested stage-2 of the L2 onto the
> stage-2 for the L1.
>
> It also turns out that for the L2, the mmu != mmu->pgt->mmu. This is
> because pgt->mmu is always set to &kvm->arch.mmu by
> kvm_pgtable_stage2_init_flags. This too will cause the VMID for the
> TLBI to be incorrect because the stage2_unmap_walker gets its mmu from
> the pgt passed to it.

Yup, and Ganapatrao noticed the same thing[1] (I obviously botched the
conversion to the new pgtable code).

I *think* this is now fixed in my nv-5.16 branch, but I'd really
appreciate if you could have a look. Bonus points if you have access
to actual HW (even in emulation), as doing this on the model is
majorly frustrating.

Thanks,

	M.

[1] https://lore.kernel.org/r/20211122095803.28943-1-gankulkarni@xxxxxxxxxxxxxxxxxxxxxx

-- 
Without deviation from the norm, progress is not possible.