Re: [PATCH V3] arm64: Don't flush tlb while clearing the accessed bit

Will Deacon <will.deacon@xxxxxxx> · Fri, 7 Dec 2018 17:53:31 +0000

On Thu, Dec 06, 2018 at 08:42:03PM +0000, Alexander Van Brunt wrote:
> > > > If we roll a TLB invalidation routine without the trailing DSB, what sort of
> > > > performance does that get you?
> > > 
> > > It is not as good. In some cases, it is really bad. Skipping the invalidate was
> > > the most consistent and fast implementation.
> 
> > My problem with that is it's not really much different to just skipping the
> > page table update entirely. Skipping the DSB is closer to what is done on
> > x86, where we bound the stale entry time to the next context-switch.
> 
> Which of the three implementations is the "that" and "it" in the first sentence?

Both "that" and "it" refer to skipping the invalidation entirely, DSB included.

> > Given that I already queued the version without the DSB, we have the choice
> > to either continue with that or to revert it and go back to the previous
> > behaviour. Which would you prefer?
> 
> To me, skipping the DSB is a win over doing the invalidate and the DSB because
> it is faster on average.
> 
> DSBs have a big impact on the performance of other CPUs in the inner shareable
> domain because of the ordering requirements. For example, we have observed
> Cortex A57s stalling all CPUs in the cluster until Device accesses complete.
> 
> Would you be open to a patch on top of the DSB skipping patch that skips the
> whole invalidate?

I don't think so; we don't have an upper bound on how long we'll have a
stale TLB entry if we remove the invalidation completely.
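
To make the difference between the three options concrete, here is a toy
userspace model. It is a sketch only, not arm64 kernel code: every name in it
(model_cpu, model_clear_young, the POLICY_* values) is hypothetical, and it
assumes the worst case in which an invalidation only takes effect at the next
DSB, and that a context switch always executes one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; this models the discussion, not the kernel. */
enum young_policy {
    POLICY_TLBI_AND_DSB,   /* previous behaviour: invalidate, then wait */
    POLICY_TLBI_NO_DSB,    /* queued version: invalidate, no trailing DSB */
    POLICY_SKIP_TLBI,      /* proposed: no invalidation at all */
};

struct model_cpu {
    bool pte_young;     /* accessed bit in the page table entry */
    bool tlb_stale;     /* TLB still caches the entry as "accessed" */
    bool tlbi_pending;  /* invalidate issued, completion not yet guaranteed */
};

/* Assumption: a DSB is the point at which a pending invalidate is complete. */
static void model_dsb(struct model_cpu *cpu)
{
    if (cpu->tlbi_pending) {
        cpu->tlb_stale = false;
        cpu->tlbi_pending = false;
    }
}

/* Assumption: the context-switch path contains a DSB. */
static void model_context_switch(struct model_cpu *cpu)
{
    model_dsb(cpu);
}

/* Clear the accessed bit and return whether it was set. */
static bool model_clear_young(struct model_cpu *cpu, enum young_policy policy)
{
    bool young = cpu->pte_young;

    cpu->pte_young = false;
    if (!young)
        return false;

    /* The TLB may still hold the old entry with the accessed bit set. */
    cpu->tlb_stale = true;

    if (policy != POLICY_SKIP_TLBI)
        cpu->tlbi_pending = true;      /* issue the invalidate */
    if (policy == POLICY_TLBI_AND_DSB)
        model_dsb(cpu);                /* wait for it to complete */

    return true;
}
```

In this model the no-DSB policy leaves the entry stale only until
model_context_switch() runs, whereas the skip policy leaves it stale
afterwards as well, with nothing in the model that ever clears it. That is the
missing upper bound.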

Will