On Thu, Aug 30, 2012 at 10:09:06AM +0900, Takuya Yoshikawa wrote:
> On Thu, 30 Aug 2012 01:51:20 +0300
> "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
> 
> > This text:
> > +	if (likely(!word_offset && !word[0]))
> > +		return -1;
> > is a left-over from the original implementation.
> > 
> > There we did a ton of gratuitous calls to interrupt
> > injection, so it was important to streamline that path:
> > lookups in empty ISR and IRR.
> > 
> > This is less common nowadays, since we have kvm_make_request;
> > additionally, 8680b94b0e6046af2644c17313287ec0cb5843dc
> > means for ISR lookups we do not need to scan
> > the vector at all, and
> > 33e4c68656a2e461b296ce714ec322978de85412
> > means we never need to scan IRR if it is empty.
> > 
> > So I think likely() here really became bogus by now.
> 
> Yes, thank you for the explanation.
> 
> > But optimizing the vector lookups is the wrong thing to do,
> > I suspect: these basically should almost never trigger.
> 
> This patch is not optimizing things at all, just removing
> unnatural logic which might be justified only by that
> bogus likely().
> 
> I explain this below.
> 
> > Besides a single possible case: a single bit in IRR might
> > still be somewhat common, I think.
> 
> Then, the current code is doing a very bad thing:
> 
> 	while ((word_offset != 0) && (word[(--word_offset) << 2] == 0))
> 		continue;
> 
> 	if (likely(!word_offset && !word[0]))
> 		return -1;
> 	else
> 		return fls(word[word_offset << 2]) - 1 + (word_offset << 5);
> 
> - checking word[0] separately does not make sense
> - using fls(), not __fls(), means we doubly check (word[i] == 0)
> 
> Actually, how can this likely() work so effectively even when we
> return -1?
> If we do the (word[0] == 0) check in the same loop, the CPU
> should naturally predict the result:
> 
> 	for (word_offset = (MAX_APIC_VECTOR >> 5) - 1;
> 	     word_offset >= 0; --word_offset) {
> 		word = p[word_offset << 2];
> 		if (word)
> 			return __fls(word) + (word_offset << 5);
> 	}
> 
> 	return -1;
> 
> > If we want to optimize the case of a single bit set in IRR,
> > my guess is the best thing is to replace irr_pending with a
> > generalization of isr_count/highest_isr_cache so we can have
> > a highest-IRR cache. This will avoid scans completely.
> 
> Yes, but let's first fix the wrong code.
> 
> Thanks,
> 	Takuya

After staring at your code for a while, it does appear to do the right
thing, and it looks cleaner than what we have now.

The commit log could be clearer, though. It should state something like:

	Clean up the code in find_highest_vector:
	- likely() is there for historical reasons; it is no longer
	  clear it's optimizing for the correct branch, and
	  find_highest_vector might not be worth optimizing at all
	- checking word[0] separately does not make sense:
	  if (word[word_offset << 2]) would be clearer
	- since we test word[...] != 0 beforehand, we can use __fls()
	  instead of fls()
	- a for loop iterating over an array is clearer than a while loop

Since you are going for cleanups, maybe we could also add:
	- get rid of the ugly >> 5 << 2, and switch to using REG_POS
	  instead?

Something like the below pseudo code would do this, I think?

#define APIC_VECTORS_PER_REG 32

	int vec;

	for (vec = MAX_APIC_VECTOR - APIC_VECTORS_PER_REG;
	     vec >= 0; vec -= APIC_VECTORS_PER_REG) {
		u32 *reg = bitmap + REG_POS(vec);
		if (*reg)
			return __fls(*reg) + vec;
	}

	return -1;

count_vectors is similar:

	for (vec = 0; vec < MAX_APIC_VECTOR; vec += APIC_VECTORS_PER_REG) {
		u32 *reg = bitmap + REG_POS(vec);
		count += hweight32(*reg);
	}

hmm?

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html