On Fri, May 17, 2024 at 06:16:26PM +0000, "Edgecombe, Rick P" <rick.p.edgecombe@xxxxxxxxx> wrote: > On Fri, 2024-05-17 at 02:03 -0700, Isaku Yamahata wrote: > > > > On top of your patch, I created the following patch to remove > > kvm_gfn_for_root(). > > Although I haven't tested it yet, I think the following shows my idea. > > > > - Add gfn_shared_mask to struct tdp_iter. > > - Use iter.gfn_shared_mask to determine the starting sptep in the root. > > - Remove kvm_gfn_for_root() > > I investigated it. Thanks for looking at it. > After this, gfn_t's never have shared bit. It's a simple rule. The MMU mostly > thinks it's operating on a shared root that is mapped at the normal GFN. Only > the iterator knows that the shared PTEs are actually in a different location. > > There are some negative side effects: > 1. The struct kvm_mmu_page's gfn doesn't match it's actual mapping anymore. > 2. As a result of above, the code that flushes TLBs for a specific GFN will be > confused. It won't functionally matter for TDX, just look buggy to see flushing > code called with the wrong gfn. flush_remote_tlbs_range() is only for Hyper-V optimization. In other cases, x86_op.flush_remote_tlbs_range = NULL or the member isn't defined at compile time. So the remote tlb flush falls back to flushing whole range. I don't expect TDX in hyper-V guest. I have to admit that the code looks superficially broken and confusing. > 3. A lot of tracepoints no longer have the "real" gfn Anyway we'd like to sort out trace points and pr_err() eventually because we already added new pferr flags. > 4. mmio spte doesn't have the shared bit, as previous (no effect) > 5. Some zapping code (__tdp_mmu_zap_root(), tdp_mmu_zap_leafs()) intends to > actually operating on the raw_gfn. It wants to iterate the whole EPT, so it goes > from 0 to tdp_mmu_max_gfn_exclusive(). So now for mirrored it does, but for > shared it only covers the shared range. Basically kvm_mmu_max_gfn() is wrong if > we pretend shared GFNs are just strangely mapped normal GFNs. Maybe we could > just fix this up to report based on GPAW for TDX? Feels wrong. Yes, it's broken with kvm_mmu_max_gfn(). > On the positive effects side: > 1. There is code that passes sp->gfn into things that it shouldn't (if it has > shared bits) like memslot lookups. > 2. Also code that passes iter.gfn into things it shouldn't like > kvm_mmu_max_mapping_level(). > > These places are not called by TDX, but if you know that gfn's might include > shared bits, then that code looks buggy. > > I think the solution in the diff is more elegant then before, because it hides > what is really going on with the shared root. That is both good and bad. Can we > accept the downsides? Kai, do you have any thoughts? -- Isaku Yamahata <isaku.yamahata@xxxxxxxxx>