Hi Reiji, Thanks for the review! On Thu, Mar 31, 2022 at 11:07:23PM -0700, Reiji Watanabe wrote: > > @@ -1267,10 +1268,24 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, > > */ > > if (fault_status == FSC_PERM && vma_pagesize == fault_granule) { > > ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot); > > When use_read_lock is set to true, it appears the above condition will > become always true (since fault_granule is PAGE_SIZE and force_pte > is true in this case). So, I don't think the following "else" changes > really make any difference. (Or am I overlooking something?) > Looking at the code, I doubt that even the original (before the regression) > code detects the case that is described in the comment below in the > first place. Yes, you are right. I liked the explicit check against !use_read_lock (even if it is dead) to make it abundantly clear what is guarded by the write lock. Personally, I'm not a big fan of the conditional locking because it is hard to tell exactly what is protected by the read or write lock. Maybe instead a WARN() could suffice. That way we bark at anyone who changes MMU locking again and breaks it. I found the bug with the splat above, but warning about replacing an already present PTE is rather far removed from the smoking gun in this situation. Outside of the regression, I believe there are some subtle races where stage-2 is modified before taking the MMU lock. I'm going to think further about the implications of these, but perhaps we should pass through a page level argument to kvm_pagetable_stage2_relax_perms() and only do the operation if the paging structures match what is reported in the FSC. If the IPA is not in fact mapped at the specified page level, retry the faulting instruction. I'll get a patch up that addresses your comment and crams a WARN() in to assert the write lock is held. -- Thanks, Oliver