On 11/16/2016 11:23 AM, Peter Zijlstra wrote:
> On Wed, Nov 16, 2016 at 12:19:09PM +0800, Pan Xinhui wrote:
>> Hi, Peter.
>> I think we can avoid a function call in a simpler way. How about below
>>
>> static inline bool vcpu_is_preempted(int cpu)
>> {
>> 	/* only set in pv case*/
>> 	if (pv_lock_ops.vcpu_is_preempted)
>> 		return pv_lock_ops.vcpu_is_preempted(cpu);
>> 	return false;
>> }
>
> That is still more expensive. It needs to do an actual load and makes it
> hard to predict the branch, you'd have to actually wait for the load to
> complete etc.

Out of curiosity, why is that hard to predict? On s390 the branch
prediction runs asynchronously ahead of the downstream pipeline (e.g.
search for "IBM z Systems Processor Optimization Primer", page 11).
Given enough capacity, I would assume that modern x86 processors do the
same and can predict this branch as soon as it becomes hot (and
otherwise you would not notice the branch miss anyway). Is x86 behaving
differently here?

> Also, it generates more code.
>
> Paravirt muck should strive to be as cheap as possible when ran on
> native hardware.

As I am interested in this series from the s390 point of view: is this
the only thing that blocks this series? Is there a chance to add a
static key around the paravirt ops somehow?
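
To make the static key question a bit more concrete, here is a rough
user-space sketch of the two shapes being compared. It is only an
illustration, not a patch: every name ending in _sketch, as well as
demo_vcpu_is_preempted(), is made up for this mail. Plain C cannot
reproduce the code patching that a real static key does, so an ordinary
bool stands in for the key. In the kernel the flag would presumably be a
DEFINE_STATIC_KEY_FALSE() key tested with static_branch_unlikely(), so
the native case would involve no load at all.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's pv_lock_ops; not the real struct. */
struct pv_lock_ops_sketch {
	bool (*vcpu_is_preempted)(int cpu);
};

/* Zero-initialised, i.e. the "native hardware" case with no callback. */
static struct pv_lock_ops_sketch pv_lock_ops_sketch;

/*
 * Stand-in for a static key. Assumption: in the kernel this would be
 * DEFINE_STATIC_KEY_FALSE() plus static_branch_unlikely(), patched into
 * the instruction stream. Here it is an ordinary bool, so this sketch
 * still performs a load; it only shows the structure of the check.
 */
static bool pv_vcpu_key_sketch;

/* Shape from the quoted mail: load the pointer, test it, call through it. */
static inline bool vcpu_is_preempted_ptr(int cpu)
{
	if (pv_lock_ops_sketch.vcpu_is_preempted)
		return pv_lock_ops_sketch.vcpu_is_preempted(cpu);
	return false;
}

/* Shape asked about: a key guards the paravirt op. */
static inline bool vcpu_is_preempted_key(int cpu)
{
	if (__builtin_expect(pv_vcpu_key_sketch, 0))
		return pv_lock_ops_sketch.vcpu_is_preempted(cpu);
	return false;
}

/* Hypothetical backend: pretend that odd-numbered vCPUs are preempted. */
static bool demo_vcpu_is_preempted(int cpu)
{
	return (cpu & 1) != 0;
}

/* What a hypervisor guest setup path might do; flips both variants on. */
static void pv_enable_sketch(void)
{
	pv_lock_ops_sketch.vcpu_is_preempted = demo_vcpu_is_preempted;
	pv_vcpu_key_sketch = true;
}
```

Before pv_enable_sketch() runs, both variants return false for every
cpu; afterwards both forward to the backend. The only intended
difference is what the native fast path has to do before it can return
false.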