On Thu, Jan 09, 2020 at 10:26:30AM -0500, Arvind Sankar wrote:
> On Thu, Jan 09, 2020 at 02:13:48PM +0000, David Laight wrote:
> > From: Sean Christopherson
> > > Sent: 08 January 2020 00:19
> > >
> > > Use a Logical OR in __is_rsvd_bits_set() to combine the two reserved bit
> > > checks, which are obviously intended to be logical statements.  Switching
> > > to a Logical OR is functionally a nop, but allows the compiler to better
> > > optimize the checks.
> > >
> > > Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> > > ---
> > >  arch/x86/kvm/mmu/mmu.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index 7269130ea5e2..72e845709027 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -3970,7 +3970,7 @@ __is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level)
> > >  {
> > >  	int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f;
> > >
> > > -	return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) |
> > > +	return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) ||
> > >  		((rsvd_check->bad_mt_xwr & (1ull << low6)) != 0);
> >
> > Are you sure this isn't deliberate?
> > The best code almost certainly comes from also removing the '!= 0'.
>
> The '!= 0' is truly superfluous, removing it doesn't affect code
> generation.
>
> > You also don't want to convert the expression result to zero.
>
> The function is static inline bool, so it's almost certainly a mistake
> originally.  The != 0 is superfluous, but this will get inlined anyway.

Ya, the bitwise-OR was added in commit 25d92081ae2f ("nEPT: Add nEPT
violation/misconfigration support"), and AFAICT it's unintentional.

That being said, I was a bit hasty in stating that a logical-OR allows
for better optimization; that's only sort of true.

For FNAME(prefetch_invalid_gpte) and FNAME(walk_addr_generic), which
branch on the result of is_rsvd_bits_set(), the logical-OR is marginally
better.  FNAME(prefetch_invalid_gpte) is what I initially looked at when
saying "yep, that's better!".

But for walk_shadow_page_get_mmio_spte(), because it aggregates the
result in a loop, the bitwise-OR is better in that it eliminates a Jcc.

And all that being said, there are two vastly superior optimizations
that can be made (rough sketches are appended below the quoted text):

  - Reorder the checks in FNAME(prefetch_invalid_gpte) to perform the
    !PRESENT and !ACCESSED checks before checking the reserved bits, as
    they are both more likely to fail and do not require additional
    memory accesses.

  - Rewrite __is_rsvd_bits_set() to make it templated.  The reserved MT
    check is EPT-only, i.e. bad_mt_xwr is always 0 for legacy 32/64-bit
    paging.

So, I'll scrap this patch and send a mini series to effect the above
optimizations.

> >
> > So:
> > 	return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) |
> > 		(rsvd_check->bad_mt_xwr & (1ull << low6));
> > The code then doesn't have any branches to get mispredicted.
> >
> > 	David
> >
> > -
> > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> > Registration No: 1397386 (Wales)
> >
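
For anyone who wants to poke at the codegen point, here is a minimal
standalone model of the aggregating case.  Everything in it is a
simplified stand-in invented for the sketch: struct rsvd_check, the
[2][5] mask layout, and any_level_has_rsvd_bits() only loosely mirror
the kernel's rsvd_bits_validate and walk_shadow_page_get_mmio_spte(),
they are not the real code.

#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct rsvd_bits_validate. */
struct rsvd_check {
	uint64_t rsvd_bits_mask[2][5];	/* indexed as [bit7][level - 1] */
	uint64_t bad_mt_xwr;		/* EPT-only; always 0 for legacy paging */
};

/* The patched (logical-OR) form of the check. */
static inline bool is_rsvd_bits_set(const struct rsvd_check *rc,
				    uint64_t pte, int level)
{
	int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f;

	return (pte & rc->rsvd_bits_mask[bit7][level - 1]) ||
	       (rc->bad_mt_xwr & (1ull << low6));
}

/*
 * Aggregating caller in the style of walk_shadow_page_get_mmio_spte():
 * the per-level results are only OR'd into 'reserved', never branched on
 * inside the loop, which is why the bitwise-OR flavor of the check saves
 * a Jcc here, while callers that branch directly on the result, e.g.
 * FNAME(prefetch_invalid_gpte), marginally prefer the logical OR.
 */
static bool any_level_has_rsvd_bits(const struct rsvd_check *rc,
				    const uint64_t *sptes, int root_level)
{
	bool reserved = false;

	for (int level = root_level; level >= 1; level--)
		reserved |= is_rsvd_bits_set(rc, sptes[level - 1], level);

	return reserved;
}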
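
A rough sketch of the first optimization, i.e. doing the cheap tests on
bits already in hand before the reserved-bits check.  gpte_is_invalid()
is a hypothetical, simplified stand-in for FNAME(prefetch_invalid_gpte),
not the kernel function; only the two bit masks are the architectural
values.

#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 PTE bits. */
#define PT_PRESENT_MASK		(1ull << 0)
#define PT_ACCESSED_MASK	(1ull << 5)

struct rsvd_check;	/* as in the previous sketch */
bool is_rsvd_bits_set(const struct rsvd_check *rc, uint64_t pte, int level);

/*
 * The tests that are more likely to fail and need nothing beyond the
 * gpte itself come first; the reserved-bits check, which has to load
 * masks from memory, comes last.
 */
static bool gpte_is_invalid(const struct rsvd_check *rc, uint64_t gpte,
			    int level)
{
	if (!(gpte & PT_PRESENT_MASK))
		return true;

	if (!(gpte & PT_ACCESSED_MASK))
		return true;

	return is_rsvd_bits_set(rc, gpte, level);
}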
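
And a rough sketch of the second optimization: split the EPT-only
bad_mt_xwr test out of the common reserved-bits check so legacy 32/64-bit
paging never evaluates it.  The names (rsvd_bits_set, bad_mt_xwr_set,
ept_rsvd_bits_set) and the simplified struct are illustrative only; the
actual mini series may end up structured differently.

#include <stdbool.h>
#include <stdint.h>

struct rsvd_check {
	uint64_t rsvd_bits_mask[2][5];
	uint64_t bad_mt_xwr;
};

/* Common reserved-bits check, shared by EPT and legacy paging. */
static inline bool rsvd_bits_set(const struct rsvd_check *rc, uint64_t pte,
				 int level)
{
	int bit7 = (pte >> 7) & 1;

	return pte & rc->rsvd_bits_mask[bit7][level - 1];
}

/* EPT-only memtype/XWR check; bad_mt_xwr is 0 for legacy paging. */
static inline bool bad_mt_xwr_set(const struct rsvd_check *rc, uint64_t pte)
{
	return rc->bad_mt_xwr & (1ull << (pte & 0x3f));
}

/* EPT flavor of the combined check; legacy callers use rsvd_bits_set() only. */
static inline bool ept_rsvd_bits_set(const struct rsvd_check *rc,
				     uint64_t pte, int level)
{
	return rsvd_bits_set(rc, pte, level) || bad_mt_xwr_set(rc, pte);
}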