On Thu, 2024-10-10 at 10:33 -0700, Sean Christopherson wrote: > > > > 1st: "fault->is_private != kvm_mem_is_private(kvm, fault->gfn)" is found. > > 2nd-6th: try_cmpxchg64() fails on each level SPTEs (5 levels in total) Isn't there a more general scenario: vcpu0 vcpu1 1. Freezes PTE 2. External op to do the SEAMCALL 3. Faults same PTE, hits frozen PTE 4. Retries N times, triggers zero-step 5. Finally finishes external op Am I missing something? > > Very technically, this shouldn't be possible. The only way for there to be > contention on the leaf SPTE is if some other KVM task installed a SPTE, i.e. > the > 6th attempt should succeed, even if the faulting vCPU wasn't the one to create > the SPTE. > > That said, a few thoughts: > > 1. Where did we end up on the idea of requiring userspace to pre-fault memory? For others reference, I think you are referring to the idea to pre-fault the entire S-EPT even for GFNs that usually get AUGed, not the mirrored EPT pre- faulting/PAGE.ADD dance we are already doing. The last discussion with Paolo was to resume the retry solution discussion on the v2 posting because it would be easier "with everything else already addressed". Also, there was also some discussion that it was not immediately obvious how prefaulting everything would work for memory hot plug (i.e. memslots added during runtime). > > 2. The zero-step logic really should have a slightly more conservative > threshold. > I have a hard time believing that e.g. 10 attempts would create a side > channel, > but 6 attempts is "fine". No idea where the threshold came from. I'm not sure if it affects the KVM design? We can look into it for curiosity sake in either case. > > 3. This would be a good reason to implement a local retry in > kvm_tdp_mmu_map(). > Yes, I'm being somewhat hypocritical since I'm so against retrying for the > S-EPT case, but my objection to retrying for S-EPT is that it _should_ be > easy > for KVM to guarantee success. > > E.g. for #3, the below (compile tested only) patch should make it impossible > for > the S-EPT case to fail, as dirty logging isn't (yet) supported and mirror > SPTEs > should never trigger A/D assists, i.e. retry should always succeed. I don't see how it addresses the scenario above. More retires could just make it rarer, but never fix it. Very possible I'm missing something though.