Re: [PATCH 00/15] KVM: arm64: Improvements to GICv3 LPI injection

Marc Zyngier <maz@xxxxxxxxxx> · Thu, 25 Jan 2024 11:02:01 +0000

Hi Oliver,

On Wed, 24 Jan 2024 20:48:54 +0000,
Oliver Upton <oliver.upton@xxxxxxxxx> wrote:
> 
> The unfortunate reality is there are increasingly large systems that are
> shipping today without support for GICv4 vLPI injection. Serialization
> in KVM's LPI routing/injection code has been a significant bottleneck
> for VMs on these machines when under a high load of LPIs (e.g. a
> multi-queue NIC).
> 
> Even though the long-term solution is quite clearly **direct
> injection**, we really ought to do something about the LPI scaling
> issues within KVM.
> 
> This series aims to improve the performance of LPI routing/injection in
> KVM by moving readers of LPI configuration data away from the
> lpi_list_lock in favor of using RCU.
> 
> Patches 1-5 change out the representation of LPIs in KVM from a
> linked-list to an xarray. While not strictly necessary for making the
> locking improvements, this seems to be an opportune time to switch to a
> data structure that can actually be indexed.
> 
> Patches 6-10 transition vgic_get_lpi() and vgic_put_lpi() away from
> taking the lpi_list_lock in favor of using RCU for protection. Note that
> this requires some rework to the way references are taken on LPIs and
> how reclaim works to be RCU safe.
> 
> Lastly, patches 11-15 rework the LRU policy on the LPI translation cache
> to not require moving elements in the linked-list and take advantage of
> this to make it an rculist readable outside of the lpi_list_lock.
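
The point above about reference-taking needing rework to be RCU safe can
be illustrated with a minimal userspace sketch. It is not the code from
this series: struct lpi, lpi_get_unless_zero() and lpi_put() are
hypothetical names, and C11 atomics stand in for the kernel's refcount
primitives. The idea is that a reader which finds an object without
holding the list lock may be racing with the final put, so it must only
take a reference if the count has not already reached zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an LPI descriptor, not the kernel's struct. */
struct lpi {
	atomic_uint refcount;
	unsigned int intid;
};

/*
 * Take a reference only if the object is still live (refcount > 0).
 * A lockless reader must use this instead of a plain increment, since
 * the object may already be on its way to being freed.
 */
static bool lpi_get_unless_zero(struct lpi *lpi)
{
	unsigned int old = atomic_load(&lpi->refcount);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&lpi->refcount, &old, old + 1))
			return true;
	}

	return false;
}

/*
 * Drop a reference. Returns true if this was the last one, in which
 * case the caller would defer the actual free until all concurrent
 * readers are done (a grace period, in RCU terms).
 */
static bool lpi_put(struct lpi *lpi)
{
	unsigned int old = atomic_fetch_sub(&lpi->refcount, 1);

	assert(old != 0);	/* catch refcount underflow */
	return old == 1;
}
```

With this shape, a lookup that loses the race with the last put simply
fails and the caller treats the LPI as not found, rather than
resurrecting an object that is about to be reclaimed.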

I quite like the overall direction. I've left a few comments here and
there, and will probably get back to it after I try to run some tests
on a big-ish box.

> All of this was tested on top of v6.8-rc1. Apologies if any of the
> changelogs are a bit too light, I'm happy to rework those further in
> subsequent revisions.
> 
> I would've liked to have benchmark data showing the improvement on top
> of upstream with this series, but I'm currently having issues with our
> internal infrastructure and upstream kernels. However, this series has
> been found to have a near 2x performance improvement to redis-memtier [*]
> benchmarks on our kernel tree.

It'd be really good to have upstream-based numbers, with details of
the actual setup (device assignment? virtio?) so that we can compare
things and even track regressions in the future.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.