Re: [PATCH V3] arm64: Don't flush tlb while clearing the accessed bit

Alexander Van Brunt <avanbrunt@xxxxxxxxxx> · Sat, 8 Dec 2018 00:05:44 +0000

> > > My problem with that is it's not really much different to just skipping the
> > > page table update entirely. Skipping the DSB is closer to what is done on
> > > x86, where we bound the stale entry time to the next context-switch.
> >  
> > Which of the three implementations is the "that" and "it" in the first sentence?
>
> that = it = skipping the whole invalidation + the DSB

The TLB is tiny compared to the size of the inactive list. For a stale entry to
matter, it has to survive in the TLB for the page's entire life on the inactive
list. That is not an easy feat except for the hottest of pages.

If there is a context-switch, most of the original thread's TLB entries will be
evicted because a TLB has a hard time holding two threads' working sets. So, in
practice, that is almost the same as the x86 guarantee.

The worst case cannot have a large impact because the maximum number of pages
whose TLB entries are not evicted is the number of entries in the TLB. For
example, a 1024 entry TLB can at worst result in 4 MB of pages (at 4 KB per
page) erroneously reclaimed. That is not bad on a system with 4+ GB of memory.
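
To make that arithmetic concrete, here is a minimal sketch of the bound. It
assumes 4 KB base pages and one page per TLB entry. The function names are made
up for illustration and are not from the patch or the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KB base pages, each TLB entry maps exactly one page. */
#define ASSUMED_PAGE_SIZE 4096ULL

/*
 * Upper bound on the memory that could be erroneously reclaimed: every
 * TLB entry holds a stale translation for a page on the inactive list.
 */
uint64_t worst_case_stale_bytes(uint64_t tlb_entries)
{
	return tlb_entries * ASSUMED_PAGE_SIZE;
}

/* The same bound expressed as a fraction of total system memory. */
double worst_case_stale_fraction(uint64_t tlb_entries, uint64_t total_bytes)
{
	assert(total_bytes > 0);
	return (double)worst_case_stale_bytes(tlb_entries) / (double)total_bytes;
}
```

For a 1024 entry TLB, worst_case_stale_bytes(1024) is 4 MB, and on a 4 GB
system worst_case_stale_fraction(1024, 4ULL << 30) is 1/1024, or roughly 0.1%
of memory.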

We did benchmark the extreme case where half the pages accessed were not
evicted from the TLB. In the read case, skipping the DSB was ~10% faster than
skipping the invalidate or doing the invalidate and the DSB.

Compared to the improvement in the average performance and variability in the
other cases we tested, the 10% loss in a carefully crafted test is not as
important.