Re: [RFC PATCH v2 41/69] KVM: x86: Add infrastructure for stolen GPA bits

"Edgecombe, Rick P" <rick.p.edgecombe@xxxxxxxxx> · Thu, 5 Aug 2021 18:43:35 +0000

On Thu, 2021-08-05 at 17:39 +0000, Sean Christopherson wrote:
> On Thu, Aug 05, 2021, Edgecombe, Rick P wrote:
> > On Thu, 2021-08-05 at 16:06 +0000, Sean Christopherson wrote:
> > > On Thu, Aug 05, 2021, Kai Huang wrote:
> > > > And removing 'gfn_stolen_bits' in 'struct kvm_mmu_page' could
> > > > also save
> > > > some memory.
> > > 
> > > But I do like saving memory...  One potentially bad idea would be
> > > to
> > > unionize gfn and stolen bits by shifting the stolen bits after
> > > they're
> > > extracted from the gpa, e.g.
> > > 
> > > 	union {
> > > 		gfn_t gfn_and_stolen;
> > > 		struct {
> > > 			gfn_t gfn:52;
> > > 			gfn_t stolen:12;
> > > 		};
> > > 	};
> > > 
> > > the downsides being that accessing just the gfn would require an
> > > additional
> > > masking operation, and the stolen bits wouldn't align with
> > > reality.
> > 
> > It definitely seems like the sp could be packed more efficiently.
> 
> Yeah, in general it could be optimized.  But for TDP/direct MMUs, we
> don't care
> thaaat much because there are relatively few shadow pages, versus
> indirect MMUs
> with thousands or tens of thousands of shadow pages.  Of course,
> indirect MMUs
> are also the most gluttonous due to the unsync_child_bitmap, gfns,
> write flooding
> count, etc...
> 
> If we really want to reduce the memory footprint for the common case
> (TDP MMU),
> the crud that's used only by indirect shadow pages could be shoved
> into a
> different struct by abusing the struct layout and wrapping
> accesses to the
> indirect-only fields with casts/container_of and helpers, e.g.
> 
Wow, I didn't realize the classic MMU was that relegated already. I'm
mostly an onlooker here, but does TDX actually need classic MMU support,
or is it just a nice-to-have?

> struct kvm_mmu_indirect_page {
> 	struct kvm_mmu_page this;
> 
> 	gfn_t *gfns;
> 	unsigned int unsync_children;
> 	DECLARE_BITMAP(unsync_child_bitmap, 512);
> 
> #ifdef CONFIG_X86_32
> 	/*
> 	 * Used out of the mmu-lock to avoid reading spte values while an
> 	 * update is in progress; see the comments in __get_spte_lockless().
> 	 */
> 	int clear_spte_count;
> #endif
> 
> 	/* Number of writes since the last time traversal visited this page.  */
> 	atomic_t write_flooding_count;
> };
> 
> 
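
FWIW, here is a rough userspace sketch of how I read the layout trick
above, mostly to check my own understanding. The structs are trimmed-down
stand-ins rather than the real kernel definitions, container_of() is
re-implemented locally, and to_indirect_sp() plus the plain "direct" flag
(standing in for sp->role.direct) are names I made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for struct kvm_mmu_page: only fields common to all MMUs. */
struct kvm_mmu_page {
	gfn_t gfn;
	bool direct;	/* assumption: stands in for sp->role.direct */
};

/* Indirect-only fields live in a wrapper that embeds the common part. */
struct kvm_mmu_indirect_page {
	struct kvm_mmu_page this;
	unsigned int unsync_children;
	int write_flooding_count;	/* atomic_t in the kernel */
};

/*
 * Hypothetical helper: recover the wrapper from the embedded page.
 * Only valid if the page was allocated as a kvm_mmu_indirect_page.
 */
static inline struct kvm_mmu_indirect_page *
to_indirect_sp(struct kvm_mmu_page *sp)
{
	assert(!sp->direct);
	return container_of(sp, struct kvm_mmu_indirect_page, this);
}
```

With something like this, direct/TDP pages would only pay for
sizeof(struct kvm_mmu_page), and every access to an indirect-only field
would have to go through the helper.
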
> > One other idea is the stolen bits could just be recovered from the
> > role
> > bits with a helper, like how the page fault error code stolen bits
> > encoding version of this works.
> 
> As in, a generic "stolen_gfn_bits" in the role instead of a per-
> feature role bit?
> That would avoid the problem of per-feature role bits leading to a
> pile of
> marshalling code, and wouldn't suffer the masking cost when accessing
> ->gfn,
> though I'm not sure that matters much.
Well, I was thinking of multiple types of aliases, along the lines of how
the pf err code stuff works, like this:

gfn_t stolen_bits(struct kvm *kvm, struct kvm_mmu_page *sp)
{
	gfn_t stolen = 0;

	if (sp->role.shared)
		stolen |= kvm->arch.gfn_shared_mask;
	if (sp->role.other_alias)
		stolen |= kvm->arch.gfn_other_mask;

	return stolen;
}

But yeah, there really only needs to be one. Still, bit shifting seems
better.
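
To make sure we mean the same thing by bit shifting, here is a minimal
userspace sketch of packing the gfn and the shifted-down stolen bits into
one 64-bit value. It uses explicit masks and shifts instead of the
bitfield union, the 52/12 split is taken from the union example earlier
in the thread, and sp_pack(), sp_gfn() and sp_stolen() are hypothetical
names, not anything in the patch.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t gfn_t;

#define SP_GFN_BITS	52
#define SP_GFN_MASK	((UINT64_C(1) << SP_GFN_BITS) - 1)

/*
 * Pack the gfn into the low 52 bits and the stolen field into the upper
 * 12 bits.  "stolen" is expected to be shifted down to bit 0 already, so
 * it no longer matches the bit position it had in the gpa.
 */
static inline gfn_t sp_pack(gfn_t gfn, gfn_t stolen)
{
	assert((gfn & ~SP_GFN_MASK) == 0);
	assert(stolen < (UINT64_C(1) << (64 - SP_GFN_BITS)));
	return gfn | (stolen << SP_GFN_BITS);
}

/* Accessing just the gfn costs one extra masking operation. */
static inline gfn_t sp_gfn(gfn_t packed)
{
	return packed & SP_GFN_MASK;
}

/* The stolen field comes back in its shifted-down form. */
static inline gfn_t sp_stolen(gfn_t packed)
{
	return packed >> SP_GFN_BITS;
}
```

Comparing the packed value directly would cover gfn and stolen bits in
one go, which is the part I find appealing.
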

> 
> > If the stolen bits are not fed into the hash calculation though it
> > would change the behavior a bit. Not sure if for better or worse.
> > Also
> > the calculation of hash collisions would need to be aware.
> 
> The role is already factored into the collision logic.
I mean that aliases of the same gfn don't necessarily collide, and the
collisions counter is only incremented if the gfn/stolen matches, but
not if the role is different.

> 
> > FWIW, I kind of like something like Sean's proposal. It's a bit
> > convoluted, but there are more unused bits in the gfn than the
> > role.
> 
> And tightly bound, i.e. there can't be more than gfn_t gfn+gfn_stolen
> bits.
> 
> > Also they are a little more related.
> 
>