On 17/7/2023 11:25 pm, Paolo Bonzini wrote:
On Mon, Jul 17, 2023 at 1:58 PM Like Xu <like.xu.linux@xxxxxxxxx> wrote:
- Use a different data type to track the producers and consumers so that lookups
don't require a linear walk. AIUI, the "tokens" used to match producers and
consumers are just kernel pointers, so I _think_ XArray would perform reasonably
well.
My measurements show that there is little performance gain from optimizing lookups.
How did you test this?
Paolo
First of all, I agree that the linear lookups here are certainly not
optimal, but the point is that they are not the culprit for the long
delay of irq_bypass_register_consumer().
Based on the user-supplied kvm_irqfd_fork load, this is a test scenario
where there are no producers and the number of consumers grows linearly.
The time delay [*] for the two list_for_each_entry() walks (w/o the
XArray proposal) is:
- avg = 444773 ns
- min = 44 ns
- max = 1865008 ns
[*] calculated as the sched_clock() delta on a 2.70GHz ICX
Compare this with the time spent waiting on mutex_lock(&lock):
- avg = 117.855314 ms
- min = 20 ns
- max = 11428.340858 ms
It's fair to say that optimizing the lock bottleneck yields the greater
performance gain, right?
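For anyone who wants to reproduce this kind of comparison, below is a minimal userspace sketch of how min/avg/max figures like the ones above can be collected. It is not the instrumentation used for these numbers: clock_gettime(CLOCK_MONOTONIC) stands in for sched_clock(), a pthread mutex stands in for the kernel mutex, and the names lat_stats, now_ns() and timed_lock() are hypothetical helpers invented for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Running min/avg/max over nanosecond samples (hypothetical helper). */
struct lat_stats {
	uint64_t min_ns;
	uint64_t max_ns;
	uint64_t sum_ns;
	uint64_t count;
};

static void lat_stats_init(struct lat_stats *s)
{
	s->min_ns = UINT64_MAX;
	s->max_ns = 0;
	s->sum_ns = 0;
	s->count = 0;
}

static void lat_stats_add(struct lat_stats *s, uint64_t delta_ns)
{
	if (delta_ns < s->min_ns)
		s->min_ns = delta_ns;
	if (delta_ns > s->max_ns)
		s->max_ns = delta_ns;
	s->sum_ns += delta_ns;
	s->count++;
}

/* Integer average; returns 0 when no samples were recorded. */
static uint64_t lat_stats_avg(const struct lat_stats *s)
{
	return s->count ? s->sum_ns / s->count : 0;
}

/* Userspace stand-in for sched_clock(): monotonic time in ns. */
static uint64_t now_ns(void)
{
	struct timespec ts;
	int ret = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(ret == 0);
	(void)ret;
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Take the lock and record how long the caller waited for it.
 * The stats are updated after the lock is acquired, so updates
 * are serialized by the lock itself. The caller must unlock.
 */
static void timed_lock(pthread_mutex_t *lock, struct lat_stats *wait)
{
	uint64_t t0 = now_ns();

	pthread_mutex_lock(lock);
	lat_stats_add(wait, now_ns() - t0);
}
```

The same pattern applies to the list walks: take now_ns() before and after the section of interest while holding the lock, and feed the delta into a second lat_stats instance. Keeping the two sets of samples separate is what makes it possible to tell lock wait time apart from time spent walking the lists.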
Please let me know what ideas you have to move this forward.