On Wed, Aug 29, 2012 at 04:10:57PM -0300, Marcelo Tosatti wrote: > On Tue, Aug 28, 2012 at 06:57:56PM +0900, Takuya Yoshikawa wrote: > > On Mon, 27 Aug 2012 17:25:42 -0300 > > Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote: > > > > > On Fri, Aug 24, 2012 at 06:15:49PM +0900, Takuya Yoshikawa wrote: > > > > Although returning -1 should be likely according to the likely(), > > > > the ASSERT in apic_find_highest_irr() will be triggered in such a case. > > > > It seems that this optimization is not working as expected. > > > > > > > > This patch simplifies the logic to mitigate this issue: search for the > > > > first non-zero word in a for loop and then use __fls() if found. When > > > > nothing found, we are out of the loop, so we can just return -1. > > > > > > Numbers please? > > > > > > How about this? > > (It would be great if someone help me understand why I got the numbers.) > > Michael, can you explain the reasoning? This text: + if (likely(!word_offset && !word[0])) + return -1; is a left-over from the original implementation. There we did a ton of gratitious calls to interrupt injection so it was important to streamline that path: lookups in emoty ISR and IRR. This is less common nowdays, since we have kvm_make_request, additionally, 8680b94b0e6046af2644c17313287ec0cb5843dc means for ISR lookups we do not need to scan the vector at all, and 33e4c68656a2e461b296ce714ec322978de85412 means we never need to scan IRR if it is empty. So I think likely() here really became bogus by now. But optimizing the vector lookups is a wrong thing to do I suspect: these basically should almost never trigger. Besides a single possible case: a single bit in IRR might still be somewhat common I think. If we want to optimize the case of a single bit set in IRR. my guess is the best thing is to replace irr_pending with generalization of isr_count/highest_isr_cache so we can have a highest IRR cache. This will avoid scans completely. > > Subject: [PATCH -v2] KVM: x86: lapic: Fix the misuse of likely() in find_highest_vector() > > > > Although returning -1 should be likely according to the likely(), not > > all callers expect such a result: > > > > Call sites: > > - apic_search_irr() > > - apic_find_highest_irr() --> ASSERT(result == -1 || result >= 16); > > - apic_clear_irr() > > - apic_find_highest_isr() --> ASSERT(result == -1 || result >= 16); > > > > The likely() might have made sense if returning -1 in apic_clear_irr() > > dominated the usage. But what I saw, by inserting counters in > > find_highest_vector(), was not like that: > > > > When the guest was being booted up, the counter for the "likely" case > > alone increased rapidly and exceeded 1,000,000. Then, after the guest > > booted up, the other cases started to happen and the "likely"/"others" > > ratio was between 1/4 to 1/3. Especially when ping from the guest to > > host was running, this was very clear: > > > > * NOT FOUND : "likely" (return -1) > > * WORD FOUND: "others" > > * printed when counter % 1000 == 0 > > > > [ 3566.796755] KVM: find_highest_vector: WORD FOUND 1707000 > > [ 3573.716623] KVM: find_highest_vector: WORD FOUND 1708000 > > [ 3575.666378] KVM: find_highest_vector: WORD FOUND 1709000 > > [ 3580.514253] KVM: find_highest_vector: NOT FOUND 1574000 > > [ 3586.763738] KVM: find_highest_vector: WORD FOUND 1710000 > > [ 3593.506674] KVM: find_highest_vector: WORD FOUND 1711000 > > [ 3595.766714] KVM: find_highest_vector: WORD FOUND 1712000 > > [ 3600.523654] KVM: find_highest_vector: NOT FOUND 1575000 > > [ 3607.485739] KVM: find_highest_vector: WORD FOUND 1713000 > > [ 3614.009400] KVM: find_highest_vector: WORD FOUND 1714000 > > [ 3616.669787] KVM: find_highest_vector: WORD FOUND 1715000 > > [ 3620.971443] KVM: find_highest_vector: NOT FOUND 1576000 > > > > This result shows that the likely() was not likely at all after the > > guest booted up. > > > > This patch simplifies the code to mitigate this issue: search for the > > first non-zero word in a for loop and then use __fls() if found. When > > nothing found, we are out of the loop, so we can just return -1. > > > > With this patch applied, even when we see successive "not found", CPU > > will predict things much better than us. > > > > Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@xxxxxxxxxxxxx> > > --- > > arch/x86/kvm/lapic.c | 18 ++++++++++-------- > > 1 files changed, 10 insertions(+), 8 deletions(-) > > > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > > index 18d149d..5eb4dde 100644 > > --- a/arch/x86/kvm/lapic.c > > +++ b/arch/x86/kvm/lapic.c > > @@ -208,16 +208,18 @@ static unsigned int apic_lvt_mask[APIC_LVT_NUM] = { > > > > static int find_highest_vector(void *bitmap) > > { > > - u32 *word = bitmap; > > - int word_offset = MAX_APIC_VECTOR >> 5; > > + u32 *p = bitmap; > > + u32 word; > > + int word_offset; > > > > - while ((word_offset != 0) && (word[(--word_offset) << 2] == 0)) > > - continue; > > + for (word_offset = (MAX_APIC_VECTOR >> 5) - 1; > > + word_offset >= 0; --word_offset) { > > + word = p[word_offset << 2]; > > + if (word) > > + return __fls(word) + (word_offset << 5); > > + } > > > > - if (likely(!word_offset && !word[0])) > > - return -1; > > - else > > - return fls(word[word_offset << 2]) - 1 + (word_offset << 5); > > + return -1; > > } > > > > static u8 count_vectors(void *bitmap) > > -- > > 1.7.5.4 > > > > -- > > To unsubscribe from this list: send the line "unsubscribe kvm" in > > the body of a message to majordomo@xxxxxxxxxxxxxxx > > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html