Re: [RESEND RFC PATCH v1 2/5] arm64: Add BBM Level 2 cpu feature

Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> · Fri, 3 Jan 2025 18:18:36 +0000

On Fri, 3 Jan 2025 16:00:59 +0000
Ryan Roberts <ryan.roberts@xxxxxxx> wrote:

> On 03/01/2025 15:35, Will Deacon wrote:
> > On Thu, Jan 02, 2025 at 12:30:34PM +0000, Marc Zyngier wrote:  
> >> On Thu, 02 Jan 2025 12:07:04 +0000,
> >> Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> wrote:  
> >>> On Thu, 19 Dec 2024 16:45:28 +0000
> >>> Will Deacon <will@xxxxxxxxxx> wrote:  
> >>>> On Thu, Dec 12, 2024 at 04:03:52PM +0000, Ryan Roberts wrote:  
> >>>>>>>> If anything, this should absolutely check for FAR_EL1 and assert that
> >>>>>>>> this is indeed caused by such change.    
> >>>>>>>
> >>>>>>> I'm not really sure how we would check this reliably? Without patch 5, the
> >>>>>>> problem is somewhat constrained; we could have as many changes in flight as
> >>>>>>> there are CPUs so we could keep a list of all the {mm_struct, VA-range} that are
> >>>>>>> being modified. But if patch 5 is confirmed to be architecturally sound, then
> >>>>>>> there is no "terminating tlbi" so there is no bound on the set of {mm_struct,
> >>>>>>> VA-range}'s that could legitimately cause a conflict abort.    
> >>>>>>
> >>>>>> I didn't mean to imply that we should identify the exact cause of the
> >>>>>> abort. I was hoping to simply check that FAR_EL1 reports a userspace
> >>>>>> VA. Why wouldn't that work?    
> >>>>>
> >>>>> Ahh gottya! Yes agreed, this sounds like the right approach.    
> >>>>
> >>>> Please, can we just not bother handling conflict aborts at all outside of
> >>>> KVM? This is all dead code, it's complicated and it doesn't scale to the
> >>>> in-kernel use-cases that others want. There's also not been any attempt
> >>>> to add the pKVM support for handling host-side conflict aborts from what
> >>>> I can tell.
> >>>>
> >>>> For now, I would suggest limiting this series just to the KVM support
> >>>> for handling a broken/malicious guest. If the contpte performance
> >>>> improvements are worthwhile (I've asked for data), then let's add support
> >>>> for the CPUs that handle the conflict in hardware (I believe this is far
> >>>> more common than reporting the abort) so that the in-kernel users can
> >>>> benefit whilst keeping the code manageable at the same time.
> >>>>  
> >>>
> >>> Given the direction the discussion is going in, it's time to raise a hand.
> >>>
> >>> Huawei has implementations that support BBML2 and that might report a TLB
> >>> conflict abort after the block size is changed directly, until an appropriate
> >>> TLB invalidation instruction completes. This implementation choice is
> >>> architecturally compliant.
> >>
> >> Compliant, absolutely. That's the letter of the spec. The usefulness
> >> aspect is, however, more debatable, and this is what Will is pointing
> >> out.
> >>
> >> Dealing with TLB Conflict aborts is an absolute pain if you need
> >> to handle it within the same Translation Regime and using the same
> >> TTBR as the one that has generated the fault. So at least for the time
> >> being, it might be preferable to only worry about the implementations
> >> that will promise to never generate such an abort and quietly perform
> >> an invalidation behind the kernel's back.  
> > 
> > Agreed. We're not dropping support for CPUs that don't give us what we'd
> > like here, we're just not bending over to port and maintain new
> > optimisations for them. I think that's a reasonable compromise?

Subject to the usual maintainability vs performance questions, sure.
Given we are the ones with the implementation, it's perhaps up to us
to prove the added complexity for a given optimization is worth the hassle
(maybe leaning on Arm to help out ;)  We have some activity going on
around this, but are unfortunately not ready to share.

> > 
> > That said, thanks for raising this, Jonathan. It's a useful data point
> > to know that TLB conflict aborts exist in the wild!  

My work here is done ;)

> 
> Indeed. Just to make it explicit; if we were to support all architecturally
> compliant BBML2 implementations, we would need to drop the final patch in this
> series. But since it sounds like we will be taking the approach of only allowing
> these optimizations for implementations that never raise conflict abort and
> handle it all in HW, it should be safe to keep the optimization in that final
> patch. I'll work with Miko to get this bashed into shape and reposted.

Obviously I'd want perf numbers to justify it (working on that), but I'd like
to keep on the table the option of patch 5 being the only part that depends
on the hardware never raising conflict aborts. I think even that is a performance
question rather than a correctness one: it simply widens the window in which
we might see a fault and have to dump the TLB. (I may well have missed something
though.)

As a side note on that last patch, it is easy to conceive of a BBML2
solution that doesn't do conflict aborts, but for which not flushing is still a
performance nightmare. As a fictional implementation: where our CPUs raise a conflict
abort, we could instead have stalled the core, pushed the abort info to a management
controller, and flushed the whole TLB (plus probably the CPU pipeline) to resolve it.
If sufficiently rare, that's not a totally stupid implementation (subject to some
optimizations). It basically offloads what we are going to do in software on
a conflict abort, with a somewhat similar performance cost, which makes widening the
window a very bad idea.

So the proposed allow list might need to be rather more nuanced than "can we get
a fault?"  We all love per-uarch, performance-related opt-ins.
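
To make that concrete, here is a rough sketch of what a more nuanced allow
list could look like. None of this comes from the series: the flag names, the
struct layout, the model identifiers and bbml2_policy_for() are all made up
for illustration, and a real version would presumably match on MIDR ranges
through the existing cpufeature machinery. The point is only that "never
raises a conflict abort" and "cheap enough to skip the TLBI" are separate
properties.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-implementation properties (not from the patch series). */
enum bbml2_flags {
	BBML2_NO_CONFLICT_ABORT = 1u << 0, /* never reports a TLB conflict abort */
	BBML2_CHEAP_RESOLVE     = 1u << 1, /* resolving a conflict in HW is cheap */
};

struct bbml2_entry {
	uint32_t model;        /* made-up id, stands in for a MIDR match */
	unsigned int flags;
};

struct bbml2_policy {
	bool use_bbml2;        /* change block size without break-before-make */
	bool elide_tlbi;       /* additionally skip the trailing TLBI (patch 5) */
};

/* Hypothetical allow list; the entries are illustrative only. */
static const struct bbml2_entry bbml2_allow_list[] = {
	{ 0x0001, BBML2_NO_CONFLICT_ABORT | BBML2_CHEAP_RESOLVE },
	{ 0x0002, BBML2_NO_CONFLICT_ABORT }, /* e.g. stalls and flushes everything */
};

static struct bbml2_policy bbml2_policy_for(uint32_t model)
{
	/* Implementations that are not listed get neither optimization. */
	struct bbml2_policy p = { false, false };
	size_t n = sizeof(bbml2_allow_list) / sizeof(bbml2_allow_list[0]);

	for (size_t i = 0; i < n; i++) {
		const struct bbml2_entry *e = &bbml2_allow_list[i];

		if (e->model != model)
			continue;

		/* Skipping BBM is only allowed if no abort can ever be reported. */
		p.use_bbml2 = (e->flags & BBML2_NO_CONFLICT_ABORT) != 0;
		/* Skipping the TLBI additionally needs cheap conflict resolution. */
		p.elide_tlbi = p.use_bbml2 &&
			       (e->flags & BBML2_CHEAP_RESOLVE) != 0;
		break;
	}
	return p;
}
```

With this shape, the fictional stall-and-flush implementation above would be
listed like model 0x0002: it can skip break-before-make but keeps the TLBI,
while an implementation that reports conflict aborts stays off the list.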

Jonathan

> 
> Thanks,
> Ryan
> 
> > 
> > Will  
> 
>