On Thu, 2022-11-17 at 20:00 -0800, David Matlack wrote:
> On Thu, Nov 17, 2022 at 5:35 PM Robert Hoo <robert.hu@xxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, 2022-11-17 at 11:14 -0500, Paolo Bonzini wrote:
> > > +
> > >               if (fault->nx_huge_page_workaround_enabled)
> > >                       disallowed_hugepage_adjust(fault, iter.old_spte, iter.level);
> > >
> >
> > And here can also be improved, I think.
> >
> >         tdp_mmu_for_each_pte(iter, mmu, fault->gfn, fault->gfn + 1) {
> > -               if (fault->nx_huge_page_workaround_enabled)
> > +               if (fault->huge_page_disallowed)
> >
> > in the case of !fault->exec && fault->nx_huge_page_workaround_enabled,
> > huge page should be still allowed, shouldn't it?
> >
> > If you agree, I can send out a patch for this. I've roughly tested
> > this, with an ordinary guest boot, works normally.
>
> This check handles the case where a read or write fault occurs within
> a region that has already been split due to an NX huge page.

If my understanding is right, an NX huge page split installs the
sub-SPTEs, so the next r/w access should not fault.

> If we
> recovered the NX Huge Page on such faults, the guest could end up
> continuously faulting on the same huge page (e.g. if writing to one
> page and executing from another within a GPA region backed by a huge
> page). So instead, NX Huge Page recovery is done periodically by a
> background thread.

Do you mean the kvm_nx_huge_page_recovery_worker() kthread? My
understanding is that it recycles SPs that were created by NX huge page
splits. I guess that is what would cause the fault above, i.e. the
previously installed SPTEs are zapped when the child SP is recycled.

OK, I understand your point now: if an r/w access fault of the type you
mention skipped disallowed_hugepage_adjust(), it would break out of the
loop and a huge page would be installed. The next exec access would then
split the huge page, the next r/w access fault would install a huge page
again, and so on.

>
> That being said, I'm not surprised you didn't encounter any issues
> when testing. Now that the TDP MMU fully splits NX Huge Pages on
> fault, such faults should be rare at best. Perhaps even impossible?

Possible, and not rare: I added debug output in
disallowed_hugepage_adjust() and it showed hits.

> Hm, can we drop the call to disallowed_hugepage_adjust()
> entirely?

I guess not; keep it as is. Even if such faults were rare or even
impossible, what if the result of is_nx_huge_page_enabled() changed at
run time, e.g. NX huge pages enabled --> disabled? Would that give it a
chance to restore the huge page mapping?
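
To make the ping-pong discussed above concrete, here is a small userspace toy model, not KVM code. Every name in it (toy_region, toy_fault, toy_run, adjust_on_rw) is made up for illustration. It also assumes that an r/w access into an already split region can still fault, for example because the target 4K SPTE was never installed or was zapped. With adjust_on_rw set, an r/w fault in a split region maps 4K and the split is kept; with it cleared, the r/w fault reinstalls the huge page and the next exec fault has to split it again.

```c
#include <stdbool.h>

/* Toy model only: none of these types or helpers exist in KVM. */
enum toy_access { TOY_RW, TOY_EXEC };

struct toy_region {
	bool huge_mapped;	/* region is currently mapped by one huge page */
	bool nx_split;		/* region was split because of an exec fault */
	int remaps;		/* times a huge page was (re)installed */
	int splits;		/* times an installed huge page was split */
};

/*
 * Handle one fault on the region.
 *
 * adjust_on_rw == true models lowering the mapping level for r/w faults
 * that hit an already split region (the split is kept).
 * adjust_on_rw == false models skipping that adjustment for r/w faults,
 * so the huge page is installed again.
 */
void toy_fault(struct toy_region *r, enum toy_access access, bool adjust_on_rw)
{
	if (access == TOY_EXEC) {
		/* An exec fault never keeps a huge mapping in this model. */
		if (r->huge_mapped) {
			r->huge_mapped = false;
			r->splits++;
		}
		r->nx_split = true;
		return;
	}

	if (r->huge_mapped)
		return;		/* already mapped, nothing to do */

	if (r->nx_split && adjust_on_rw)
		return;		/* map a 4K page, keep the split */

	/* No adjustment: install the huge page over the split region. */
	r->huge_mapped = true;
	r->nx_split = false;
	r->remaps++;
}

/* Alternate exec and r/w faults and count how often a huge page is split. */
int toy_run(bool adjust_on_rw, int rounds)
{
	struct toy_region r = { 0 };
	int i;

	for (i = 0; i < rounds; i++) {
		toy_fault(&r, TOY_EXEC, adjust_on_rw);
		toy_fault(&r, TOY_RW, adjust_on_rw);
	}
	return r.splits;
}
```

In this model toy_run(true, n) never splits an installed huge page, while toy_run(false, n) splits one on every round after the first. That is the repeated faulting the periodic recovery worker is meant to avoid.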