On Thu, Aug 30, 2012 at 10:09:06AM +0900, Takuya Yoshikawa wrote:
> On Thu, 30 Aug 2012 01:51:20 +0300
> "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
> 
> > This text:
> > +	if (likely(!word_offset && !word[0]))
> > +		return -1;
> > is a left-over from the original implementation.
> > 
> > There we did a ton of gratuitous calls to interrupt
> > injection, so it was important to streamline that path:
> > lookups in empty ISR and IRR.
> > 
> > This is less common nowadays, since we have kvm_make_request;
> > additionally, 8680b94b0e6046af2644c17313287ec0cb5843dc
> > means for ISR lookups we do not need to scan
> > the vector at all, and
> > 33e4c68656a2e461b296ce714ec322978de85412
> > means we never need to scan IRR if it is empty.
> > 
> > So I think likely() here really became bogus by now.
> 
> Yes, thank you for the explanation.
> 
> > But optimizing the vector lookups is the wrong thing to do,
> > I suspect: these basically should almost never trigger.
> 
> This patch is not optimizing things at all, just removing
> unnatural logic which might be justified only by that
> bogus likely().
> 
> I explain this below.
> 
> > Besides a single possible case: a single bit in IRR might
> > still be somewhat common, I think.
> 
> Then, the current code is doing a very bad thing:
> 
> 	while ((word_offset != 0) && (word[(--word_offset) << 2] == 0))
> 		continue;
> 
> 	if (likely(!word_offset && !word[0]))
> 		return -1;
> 	else
> 		return fls(word[word_offset << 2]) - 1 + (word_offset << 5);
> 
> - checking word[0] separately does not make sense
> - using fls(), not __fls(), means we doubly check (word[i] == 0)
> 
> Actually, how can this likely() work so effectively even when we
> return -1?
> If we do the (word[0] == 0) check in the same loop, the CPU
> should naturally predict the result:
> 
> 	for (word_offset = (MAX_APIC_VECTOR >> 5) - 1;
> 	     word_offset >= 0; --word_offset) {
> 		word = p[word_offset << 2];
> 		if (word)
> 			return __fls(word) + (word_offset << 5);
> 	}
> 
> 	return -1;
> 
> > If we want to optimize the case of a single bit set in IRR,
> > my guess is the best thing is to replace irr_pending with a
> > generalization of isr_count/highest_isr_cache so we can have
> > a highest-IRR cache. This will avoid scans completely.
> 
> Yes, but let's first fix the wrong code.
> 
> Thanks,
> 	Takuya

After staring at your code for a while, it does appear to do the right
thing, and it looks cleaner than what we have now.

The commit log could be clearer, though. It should state something like:

	Clean up the code in find_highest_vector:
	- likely() is there for historical reasons; it is no longer
	  clear it's optimizing for the correct branch, and
	  find_highest_vector might not be worth optimizing at all
	- checking word[0] separately does not make sense:
	  if (word[word_offset << 2]) would be clearer
	- since we test word[...] != 0 beforehand, we can use __fls()
	  instead of fls()
	- a for loop iterating over an array is clearer than a while loop

Since you are going for cleanups, maybe we could also add:
	- get rid of the ugly >> 5 << 2, and switch to using REG_POS
	  instead?

Something like the below pseudo code would do this, I think?

#define APIC_VECTORS_PER_REG 32

	int vec;

	for (vec = MAX_APIC_VECTOR - APIC_VECTORS_PER_REG;
	     vec >= 0; vec -= APIC_VECTORS_PER_REG) {
		u32 *reg = bitmap + REG_POS(vec);
		if (*reg)
			return __fls(*reg) + vec;
	}

	return -1;

count_vectors is similar:

	for (vec = 0; vec < MAX_APIC_VECTOR; vec += APIC_VECTORS_PER_REG) {
		u32 *reg = bitmap + REG_POS(vec);
		count += hweight32(*reg);
	}

hmm?

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html