On Thu, 2021-08-05 at 17:39 +0000, Sean Christopherson wrote:
> On Thu, Aug 05, 2021, Edgecombe, Rick P wrote:
> > On Thu, 2021-08-05 at 16:06 +0000, Sean Christopherson wrote:
> > > On Thu, Aug 05, 2021, Kai Huang wrote:
> > > > And removing 'gfn_stolen_bits' in 'struct kvm_mmu_page' could also
> > > > save some memory.
> > >
> > > But I do like saving memory... One potentially bad idea would be to
> > > unionize gfn and stolen bits by shifting the stolen bits after they're
> > > extracted from the gpa, e.g.
> > >
> > > 	union {
> > > 		gfn_t gfn_and_stolen;
> > > 		struct {
> > > 			gfn_t gfn:52;
> > > 			gfn_t stolen:12;
> > > 		};
> > > 	};
> > >
> > > the downsides being that accessing just the gfn would require an
> > > additional masking operation, and the stolen bits wouldn't align with
> > > reality.
> >
> > It definitely seems like the sp could be packed more efficiently.
>
> Yeah, in general it could be optimized. But for TDP/direct MMUs, we don't
> care thaaat much because there are relatively few shadow pages, versus
> indirect MMUs with thousands or tens of thousands of shadow pages. Of
> course, indirect MMUs are also the most gluttonous due to the
> unsync_child_bitmap, gfns, write flooding count, etc...
>
> If we really want to reduce the memory footprint for the common case
> (TDP MMU), the crud that's used only by indirect shadow pages could be
> shoved into a different struct by abusing the struct layout and wrapping
> accesses to the indirect-only fields with casts/container_of and
> helpers, e.g.

Wow, didn't realize the classic MMU was already that relegated. Mostly an
onlooker here, but does TDX actually need classic MMU support, or is it
just nice to have?

> 	struct kvm_mmu_indirect_page {
> 		struct kvm_mmu_page this;
>
> 		gfn_t *gfns;
> 		unsigned int unsync_children;
> 		DECLARE_BITMAP(unsync_child_bitmap, 512);
>
> #ifdef CONFIG_X86_32
> 		/*
> 		 * Used out of the mmu-lock to avoid reading spte values while an
> 		 * update is in progress; see the comments in
> 		 * __get_spte_lockless().
> 		 */
> 		int clear_spte_count;
> #endif
>
> 		/* Number of writes since the last time traversal visited this page. */
> 		atomic_t write_flooding_count;
> 	};

> > One other idea is the stolen bits could just be recovered from the role
> > bits with a helper, like how the page fault error code encoding of the
> > stolen bits works.
>
> As in, a generic "stolen_gfn_bits" in the role instead of a per-feature
> role bit? That would avoid the problem of per-feature role bits leading
> to a pile of marshalling code, and wouldn't suffer the masking cost when
> accessing ->gfn, though I'm not sure that matters much.

Well, I was thinking multiple types of aliases, like how the pf err code
stuff works, e.g.:

gfn_t stolen_bits(struct kvm *kvm, struct kvm_mmu_page *sp)
{
	gfn_t stolen = 0;

	if (sp->role.shared)
		stolen |= kvm->arch.gfn_shared_mask;
	if (sp->role.other_alias)
		stolen |= kvm->arch.gfn_other_mask;

	return stolen;
}

But yeah, there really only needs to be one. Still, the bit shifting
seems better.

> > If the stolen bits are not fed into the hash calculation though, it
> > would change the behavior a bit. Not sure if for better or worse. Also,
> > the calculation of hash collisions would need to be aware.
>
> The role is already factored into the collision logic.

I mean that aliases of the same gfn don't necessarily collide, and that
whether the collisions counter gets incremented depends on the gfn/stolen
bits matching, not on the role being different.
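Going back to the container_of idea for a second, a minimal sketch of the
wrappers Sean describes above might look like the below (untested; the
helper names and the WARN are invented here for illustration, not from
any posted patch):

/*
 * Hypothetical accessors for the split layout: direct/TDP pages would
 * allocate only struct kvm_mmu_page, indirect pages would allocate the
 * container struct.
 */
static inline struct kvm_mmu_indirect_page *
to_indirect_sp(struct kvm_mmu_page *sp)
{
	/* Only indirect shadow pages live inside the container. */
	WARN_ON_ONCE(sp->role.direct);
	return container_of(sp, struct kvm_mmu_indirect_page, this);
}

static inline gfn_t *sp_gfns(struct kvm_mmu_page *sp)
{
	return to_indirect_sp(sp)->gfns;
}

static inline unsigned int sp_unsync_children(struct kvm_mmu_page *sp)
{
	return to_indirect_sp(sp)->unsync_children;
}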
> > FWIW, I kind of like something like Sean's proposal. It's a bit
> > convoluted, but there are more unused bits in the gfn than the role.
>
> And tightly bound, i.e. there can't be more gfn + gfn_stolen bits than
> fit in a gfn_t.
>
> > Also they are a little more related.
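And for the union approach, the extra masking Sean flagged as the
downside stays pretty contained, e.g. (just a sketch reusing the 52/12
split from his example; the macro and helper names are made up):

#define SP_GFN_BITS	52

static inline gfn_t sp_gfn(const struct kvm_mmu_page *sp)
{
	/* The extra masking: strip the stolen bits off the top. */
	return sp->gfn_and_stolen & (BIT_ULL(SP_GFN_BITS) - 1);
}

static inline gfn_t sp_stolen(const struct kvm_mmu_page *sp)
{
	/* Stored shifted down, i.e. not aligned with their gpa position. */
	return sp->gfn_and_stolen >> SP_GFN_BITS;
}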