Re: [RFC PATCH v2 41/69] KVM: x86: Add infrastructure for stolen GPA bits

Sean Christopherson <seanjc@xxxxxxxxxx> · Thu, 5 Aug 2021 16:06:41 +0000




On Thu, Aug 05, 2021, Kai Huang wrote:
> On Fri, 2 Jul 2021 15:04:47 -0700 isaku.yamahata@xxxxxxxxx wrote:
> > From: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
> > @@ -2020,6 +2032,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
> >  	sp = kvm_mmu_alloc_page(vcpu, direct);
> >  
> >  	sp->gfn = gfn;
> > +	sp->gfn_stolen_bits = gfn_stolen_bits;
> >  	sp->role = role;
> >  	hlist_add_head(&sp->hash_link, sp_list);
> >  	if (!direct) {
> > @@ -2044,6 +2057,13 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
> >  	return sp;
> >  }
> 
> 
> Sorry for replying old thread,

Ha, one month isn't old, it's barely even mature.

> but to me it looks weird to have gfn_stolen_bits
> in 'struct kvm_mmu_page'.  If I understand correctly, above code basically
> means that GFN with different stolen bit will have different 'struct
> kvm_mmu_page', but in the context of this patch, mappings with different
> stolen bits still use the same root,

You're conflating "mapping" with "PTE".  The GFN is a per-PTE value.  Yes, there
is a final GFN that is representative of the mapping, but more directly the final
GFN is associated with the leaf PTE.

TDX effectively adds the restriction that all PTEs used for a mapping must have
the same shared/private status, so mapping and PTE are somewhat interchangeable
when talking about stolen bits (the shared bit), but in the context of this patch,
the stolen bits are a property of the PTE.

Back to your statement, it's incorrect.  PTEs (effectively mappings in TDX) with
different stolen bits will _not_ use the same root.  kvm_mmu_get_page() includes
the stolen bits in both the hash lookup and in the comparison, i.e. restores the
stolen bits when looking for an existing shadow page at the target GFN.

@@ -1978,9 +1990,9 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
                role.quadrant = quadrant;
        }

-       sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)];
+       sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn_and_stolen)];
        for_each_valid_sp(vcpu->kvm, sp, sp_list) {
-               if (sp->gfn != gfn) {
+               if ((sp->gfn | sp->gfn_stolen_bits) != gfn_and_stolen) {
                        collisions++;
                        continue;
                }

> which means gfn_stolen_bits doesn't make a lot of sense at least for root
> page table. 

It does make sense, even without a follow-up patch.  In Rick's original series,
stealing a bit for execute-only guest memory, there was only a single root.  And
except for TDX, there can only ever be a single root because the shared EPTP isn't
usable, i.e. there's only the regular/private EPTP.

> Instead, having gfn_stolen_bits in 'struct kvm_mmu_page' only makes sense in
> the context of TDX, since TDX requires two separate roots for private and
> shared mappings.

> So given we cannot tell whether the same root, or different roots should be
> used for different stolen bits, I think we should not add 'gfn_stolen_bits' to
> 'struct kvm_mmu_page' and use it to determine whether to allocate a new table
> for the same GFN, but should use a new role (i.e role.private) to determine.

A new role would work, too, but it has the disadvantage of not automagically
working for all uses of stolen bits, e.g. XO support would have to add another
role bit.

> And removing 'gfn_stolen_bits' in 'struct kvm_mmu_page' could also save some
> memory.

But I do like saving memory...  One potentially bad idea would be to unionize
gfn and stolen bits by shifting the stolen bits after they're extracted from the
gpa, e.g.

	union {
		gfn_t gfn_and_stolen;
		struct {
			gfn_t gfn:52;
			gfn_t stolen:12;
		}
	};

the downsides being that accessing just the gfn would require an additional masking
operation, and the stolen bits wouldn't align with reality.