On Sat, Jan 08, 2022, Vihas Mak wrote: > On Fri, Jan 07, 2022, Sean Christopherson wrote: > >> NAK, this change is functionally wrong. The checks are inside the loop because > >> the flow fails if and only if there is at least one indirect, valid shadow pages > >> at the target gfn. The @prefetch check is even more restrictive as it bails if > >> there is at least one indirect, valid, synchronized shadow page. > >> The can_unsync check could be "optimized" to > >> > >> if (!can_unsync && kvm_gfn_has_indirect_valid_sp()) > >> > >> but identifying whether or not there's a valid SP requires walking the list of > >> shadow pages for the gfn, so it's simpler to just handle the check in the loop. > >> And "optimized" in quotes because both checks will be well-predicted single-uop > >> macrofused TEST+Jcc on modern CPUs, whereas walking the list twice would be > >> relatively expensive if there are shadow pages for the gfn. > > > So this change isn't safe. I will look into the optimization suggested > by Sean. Sorry for this patch. Heh, I wasn't actually suggesting we do the "optimization". I was pointing out what the code would look like _if_ we wanted to move the checks out of the loop, but I do not actually think we should make any changes, quite the opposite. The theoretical worst case if there no indirect valid SPs, but lots of direct and/or invalid SPs is far worse than burning a few uops per loop. The compiler is smart enough to handle the checks out of line, and I'm sure there are other optimizations being made as well. In other words, odds are very good that trying to optimize the code will do more harm than good.