On 29/07/2013 18:24, Gleb Natapov wrote:
> On Mon, Jul 29, 2013 at 04:12:33PM +0200, Paolo Bonzini wrote:
>> On 29/07/2013 15:20, Gleb Natapov wrote:
>>>> 2) in cases like this you just do not use likely/unlikely; the branch
>>>> will be very unlikely in the beginning, and very likely once shadow
>>>> pages are filled or in the no-EPT case.  Just let the branch predictor
>>>> adjust, it will probably do better than boolean tricks.
>>>>
>>> likely/unlikely are usually useless anyway. If you can avoid if()
>>> altogether this is a win since there is no branch to predict.
>>
>> However, if the branches are dynamically well-predicted,
>>
>>     if (simple)
>>         ...
>>     if (complex)
>>         ...
>>
>> is likely faster than
>>
>>     if (simple | complex)
>>
>> because the branches then are very very cheap, and it pays off to not
>> always evaluate the complex branch.
>
> Good point about "|" always evaluating both. Is this the case
> with if (simple != 0 | complex != 0) too, where theoretically the compiler
> may see that if simple != 0 is true there is no need to evaluate the
> second one?

Yes (only if complex doesn't have any side effects, which is the case here).

>> Yeah, I also thought of always checking bad_mt_xwr and even using it to
>> subsume the present check too, i.e. turning it into
>> is_rsvd_bits_set_or_nonpresent.  It checks the same bits that are used
>> in the present check (well, a superset).  You can then check for
>> presence separately if you care, which you don't in
>> prefetch_invalid_gpte.  It requires small changes in the callers but
>> nothing major.
>
> I do not get what is_rsvd_bits_set_or_nonpresent() will check exactly
> and why we need it; there are two places where we check
> present/reserved and in one of them we need to know which one it is.

You can OR bad_mt_xwr with 0x5555555555555555ULL (I think).  Then your
implementation of is_rsvd_bits_set() using bad_mt_xwr will return true
in all cases where the pte is non-present.
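A minimal sketch of that OR, under two assumptions that are not spelled
out in this thread: the bitmap is indexed by the low six bits of the pte,
and "present" means bit 0 is set.  The names bitmap_hit and
NONPRESENT_MASK are made up for illustration; this is not the code from
the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Every even bit set: covers all 6-bit indices whose bit 0 is clear.
 * Assumption: bit 0 of the pte is the "present" bit.
 */
#define NONPRESENT_MASK 0x5555555555555555ULL

/* Assumption: the bitmap is indexed by the low six bits of the pte. */
static inline bool bitmap_hit(uint64_t bitmap, uint64_t pte)
{
    return (bitmap >> (pte & 0x3f)) & 1;
}

/*
 * One lookup that is true for a bad low-bit combination recorded in
 * bad_mt_xwr, and also for any pte whose bit 0 is clear.
 */
static inline bool is_rsvd_bits_set_or_nonpresent(uint64_t bad_mt_xwr,
                                                  uint64_t pte)
{
    return bitmap_hit(bad_mt_xwr | NONPRESENT_MASK, pte);
}
```

In real code the OR would presumably be folded into the bitmap once, when
it is built, so the fast path stays a single shift and mask.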
You can then call is_present_pte to discriminate the two cases:

    if (is_rsvd_bits_set_or_nonpresent) {
        if (!present)
            ...
        else
            ...
    }

In more abstract terms, this goes from

    if (simple)
        ...
    if (complex)
        ...

to

    if (simple_or_complex) {
        if (simple)
            ...
        else
            ...
    }

This can actually make sense if simple is almost always false, because
then you save something from not evaluating it on the fast path.

But in this case, adding bad_mt_xwr to the non-EPT case is a small loss.

> Anyway, the order of checks in prefetch_invalid_gpte() is not relevant to
> that patchset, so let's leave it to a separate discussion.

Yes.

Paolo

>>
>> But it still seems to me that we're in the above "if (simple ||
>> complex)" case and having a separate "if (!present)" check will be faster.
>>
>> Paolo
>
> --
> Gleb.