Re: [RFC PATCH] KVM: arm/arm64: Enable direct irqfd MSI injection

Andre Przywara <andre.przywara@xxxxxxx> · Wed, 15 May 2019 17:38:32 +0100

On Mon, 18 Mar 2019 13:30:40 +0000
Marc Zyngier <marc.zyngier@xxxxxxx> wrote:

Hi,

> On Sun, 17 Mar 2019 19:35:48 +0000
> Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
> 
> [...]
> 
> > A first approach would be to keep a small cache of the last few
> > successful translations for this ITS, cache that could be looked-up by
> > holding a spinlock instead. A hit in this cache could directly be
> > injected. Any command that invalidates or changes anything (DISCARD,
> > INV, INVALL, MAPC with V=0, MAPD with V=0, MOVALL, MOVI) should nuke
> > the cache altogether.  
> 
> And to explain what I meant with this, I've pushed a branch[1] with a
> basic prototype. It is good enough to get a VM to boot, but I wouldn't
> trust it for anything serious just yet.
> 
> If anyone feels like giving it a go and check whether it has any
> benefit performance wise, please do so.

So I took a stab at the performance aspect, and it took me a while to find
something where it actually makes a difference. The trick is to create *a
lot* of interrupts. This is my setup now:
- GICv3 and ITS
- 5.1.0 kernel vs. 5.1.0 plus Marc's rebased "ITS cache" patches on top
- 4 VCPU guest on a 4 core machine
- passing through a M.2 NVMe SSD (or a USB3 controller) to the guest
- running FIO in the guest, with:
  - 4K block size, random reads, queue depth 16, 4 jobs (small)
  - 1M block size, sequential reads, QD 1, 1 job (big)

For the NVMe disk I see a whopping 19% performance improvement with Marc's
series (for the small blocks). For a SATA SSD connected via USB3.0 I still
see 6% improvement. For NVMe there were 50,000 interrupts per second on
the host, the USB3 setup came only up to 10,000/s. For big blocks (with
IRQs in the low thousands/s) the win is less, but still a measurable 3%.

Now that I have the setup, I can rerun experiments very quickly (given I
don't loose access to the machine), so let me know if someone needs
further tests.

Cheers,
Andre.

> [1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/its-translation-cache