Re: [PATCH 12/15] KVM: arm64: vgic-its: Pick cache victim based on usage count

On Thu, Jan 25, 2024 at 10:55:19AM +0000, Marc Zyngier wrote:
> On Wed, 24 Jan 2024 20:49:06 +0000, Oliver Upton <oliver.upton@xxxxxxxxx> wrote:

[...]

> > +static struct vgic_translation_cache_entry *vgic_its_cache_victim(struct vgic_dist *dist)
> > +{
> > +	struct vgic_translation_cache_entry *cte, *victim = NULL;
> > +	u64 min, tmp;
> > +
> > +	/*
> > +	 * Find the least used cache entry since the last cache miss, preferring
> > +	 * older entries in the case of a tie. Note that usage accounting is
> > +	 * deliberately non-atomic, so this is all best-effort.
> > +	 */
> > +	list_for_each_entry(cte, &dist->lpi_translation_cache, entry) {
> > +		if (!cte->irq)
> > +			return cte;
> > +
> > +		tmp = atomic64_xchg_relaxed(&cte->usage_count, 0);
> > +		if (!victim || tmp <= min) {
> 
> min is not initialised until after the first round. Not great. How
> comes the compiler doesn't spot this?

min never gets read on the first iteration, since victim is known to be
NULL. Happy to initialize it, though, to keep this more obviously sane.
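For illustration, here is a minimal user-space sketch of the same scan with min initialised up front. The names and the flat array are hypothetical stand-ins for the kernel's list-based cache, and C11 stdatomic stands in for the kernel atomics:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for vgic_translation_cache_entry. */
struct cache_entry {
	atomic_uint_fast64_t usage_count;
	int valid;
};

/*
 * Same scan as the patch, but with min initialised before the loop so
 * there is no conditionally-uninitialised read to reason about.
 */
static struct cache_entry *pick_victim(struct cache_entry *cache, size_t n)
{
	struct cache_entry *victim = NULL;
	uint64_t min = UINT64_MAX;

	for (size_t i = 0; i < n; i++) {
		if (!cache[i].valid)
			return &cache[i];

		/*
		 * Reset the counter and track the smallest value seen;
		 * <= makes later (older) entries win ties, as in the patch.
		 */
		uint64_t tmp = atomic_exchange_explicit(&cache[i].usage_count,
							0, memory_order_relaxed);
		if (tmp <= min) {
			victim = &cache[i];
			min = tmp;
		}
	}
	return victim;
}
```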

> > +			victim = cte;
> > +			min = tmp;
> > +		}
> > +	}
> 
> So this resets all the counters on each search for a new insertion?
> Seems expensive, specially on large VMs (512 * 16 = up to 8K SWP
> instructions in a tight loop, and I'm not even mentioning the fun
> without LSE). I can at least think of a box that will throw its
> interconnect out of the pram it tickled that way.

Well, only on each cache eviction once we hit the cache limit. I wrote
this up to have _something_ in place that the rculist conversion could
later come back to and rework further, but that obviously didn't happen.

> I'd rather the new cache entry inherits the max of the current set,
> making it a lot cheaper. We can always detect the overflow and do a
> full invalidation in that case (worse case -- better options exist).

Yeah, I like your suggested approach. I'll probably build a bit on top
of that.
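Roughly, the idea as I understand it (a hedged user-space sketch, with hypothetical names and C11 stdatomic in place of the kernel atomics): seed the new entry with the current maximum usage count rather than resetting every counter, and flag overflow so the caller can fall back to a full invalidation:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for vgic_translation_cache_entry. */
struct cache_entry {
	atomic_uint_fast64_t usage_count;
	int valid;
};

/*
 * On insertion, seed the new entry with the maximum usage count among
 * the current entries, leaving the existing counters alone. Returns
 * nonzero if the maximum is about to overflow, in which case the caller
 * should invalidate the whole cache instead.
 */
static int seed_new_entry(struct cache_entry *cache, size_t n,
			  struct cache_entry *new)
{
	uint64_t max = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t v;

		if (!cache[i].valid)
			continue;

		v = atomic_load_explicit(&cache[i].usage_count,
					 memory_order_relaxed);
		if (v > max)
			max = v;
	}

	if (max == UINT64_MAX)
		return 1;	/* overflow: caller does a full invalidation */

	atomic_store_explicit(&new->usage_count, max, memory_order_relaxed);
	new->valid = 1;
	return 0;
}
```

Relaxed loads replace the SWP-heavy xchg loop, which is the cheapness Marc is after; the trade-off is that counters only ever grow between invalidations.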

> > +
> > +	return victim;
> > +}
> > +
> >  static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
> >  				       u32 devid, u32 eventid,
> >  				       struct vgic_irq *irq)
> > @@ -645,9 +664,12 @@ static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
> >  		goto out;
> >  
> >  	if (dist->lpi_cache_count >= vgic_its_max_cache_size(kvm)) {
> > -		/* Always reuse the last entry (LRU policy) */
> > -		victim = list_last_entry(&dist->lpi_translation_cache,
> > -				      typeof(*cte), entry);
> > +		victim = vgic_its_cache_victim(dist);
> > +		if (WARN_ON_ONCE(!victim)) {
> > +			victim = new;
> > +			goto out;
> > +		}
>
> I don't understand how this could happen. It sort of explains the
> oddity I was mentioning earlier, but I don't think we need this
> complexity.

The only way it could actually happen is if a bug were introduced where
lpi_cache_count is somehow nonzero but the list is empty. But yeah, we
can dump this and assume we find a victim, which ought to always be
true.

-- 
Thanks,
Oliver



