On Wed, Jul 28, 2021 at 09:04:30PM +0000, Sean Christopherson wrote:
> >  struct pte_list_desc {
> >  	u64 *sptes[PTE_LIST_EXT];
> > +	/*
> > +	 * Stores number of entries stored in the pte_list_desc.  No need to be
> > +	 * u64 but just for easier alignment.  When PTE_LIST_EXT, means full.
> > +	 */
> > +	u64 spte_count;
>
> Per my feedback to the previous patch, this should be above sptes[] so that rmaps
> with <8 SPTEs only touch one cache line.  No idea if it actually matters in
> practice, but I can't see how it would harm anything.

While at it, I'll also move "more" to the start of the struct, which I think
optimizes the full-entries case too.

/*
 * Slight optimization of cacheline layout, by putting `more' and `spte_count'
 * at the start; then accessing them will only touch one single cacheline for
 * either the full (entries==PTE_LIST_EXT) case or entries<=6.
 */
struct pte_list_desc {
	struct pte_list_desc *more;
	/*
	 * Stores number of entries stored in the pte_list_desc.  No need to be
	 * u64 but just for easier alignment.  When PTE_LIST_EXT, means full.
	 */
	u64 spte_count;
	u64 *sptes[PTE_LIST_EXT];
};

Thanks,

--
Peter Xu
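
The layout claim above can be sanity-checked with a small userspace sketch.
This is not kernel code, only an illustration. It assumes 64-byte cache
lines, 8-byte pointers, a descriptor that starts on a cache-line boundary,
and PTE_LIST_EXT == 14; none of these values are stated in this mail.
cacheline_of() and sptes_in_first_line() are hypothetical helpers made up
for the check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for illustration only (not taken from the mail). */
#define CACHELINE_BYTES	64
#define PTE_LIST_EXT	14

typedef uint64_t u64;

/* Same field order as the proposed layout. */
struct pte_list_desc {
	struct pte_list_desc *more;
	u64 spte_count;
	u64 *sptes[PTE_LIST_EXT];
};

/*
 * Hypothetical helper: index of the cache line that holds a given byte
 * offset, assuming the struct itself starts on a cache-line boundary.
 */
size_t cacheline_of(size_t offset)
{
	return offset / CACHELINE_BYTES;
}

/*
 * Hypothetical helper: number of sptes[] slots that lie entirely within
 * the first cache line of the descriptor.
 */
size_t sptes_in_first_line(void)
{
	size_t base = offsetof(struct pte_list_desc, sptes);
	size_t n = 0;

	while (n < PTE_LIST_EXT &&
	       base + (n + 1) * sizeof(u64 *) <= CACHELINE_BYTES)
		n++;
	return n;
}

/* Both bookkeeping fields must sit in the first cache line. */
_Static_assert(offsetof(struct pte_list_desc, more) < CACHELINE_BYTES,
	       "more must be in the first cache line");
_Static_assert(offsetof(struct pte_list_desc, spte_count) < CACHELINE_BYTES,
	       "spte_count must be in the first cache line");
```

Under those assumptions, sptes_in_first_line() evaluates to 6, which matches
the "entries<=6" wording in the comment, and a full descriptor only needs
`more' and `spte_count' from the first line to move on to the next one.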