On Fri, 2017-07-14 at 09:31 +0100, Mel Gorman wrote:
> It may also be only a gain on a limited number of architectures depending
> on exactly how an architecture handles flushing. At the time, batching
> this for x86 in the worst-case scenario where all pages being reclaimed
> were mapped from multiple threads knocked 24.4% off elapsed run time and
> 29% off system CPU, but only on multi-socket NUMA machines. On UMA, it was
> barely noticeable. For some workloads where only a few pages are mapped or
> the mapped pages on the LRU are relatively sparse, it'll make no difference.
>
> The worst-case situation is extremely IPI intensive on x86, where many
> IPIs were being sent for each unmap. It's only worth even considering if
> you see that the time spent sending IPIs for flushes is a large portion
> of reclaim.

Ok, it would be interesting to see how that compares to powerpc with its
HW TLB invalidation broadcasts. We tend to hate them and prefer IPIs in
most cases, but maybe not *this* case... (mostly we find that IPI + local
invalidation is better for large-scale invalidations, such as a full mm
on exit/fork etc.).

In the meantime I found the original commits; we'll dig and see if it's
useful for us.

Cheers,
Ben.
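
[Editorial note: for readers unfamiliar with the batching Mel describes, below
is a minimal conceptual sketch of the idea in plain C. The names and data
structures (flush_batch, batch_note_unmap, batch_flush, NR_CPUS here) are
hypothetical stand-ins, not the kernel's actual interface: rather than issuing
a TLB-shootdown IPI for every page unmapped during reclaim, the reclaim path
only records which CPUs might hold stale entries and issues a single flush
pass for the whole batch.]

/*
 * Conceptual sketch of batched TLB flushing during reclaim (hypothetical
 * names, not the kernel's real interface).  Instead of one IPI per
 * unmapped page, note which CPUs may hold stale TLB entries and flush
 * them once at the end of the batch.
 */
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 8

struct flush_batch {
	bool cpu_needs_flush[NR_CPUS];	/* stands in for a cpumask */
	bool pending;
};

/* Called per unmapped page: cheap bookkeeping only, no IPI yet. */
static void batch_note_unmap(struct flush_batch *b, const bool mm_cpus[NR_CPUS])
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mm_cpus[cpu])
			b->cpu_needs_flush[cpu] = true;
	b->pending = true;
}

/* Called once per reclaim batch: one round of IPIs (or one HW broadcast
 * on architectures that have it) covers every page unmapped so far. */
static void batch_flush(struct flush_batch *b)
{
	if (!b->pending)
		return;
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (b->cpu_needs_flush[cpu])
			printf("flush TLB on cpu %d\n", cpu);	/* one flush per CPU per batch */
		b->cpu_needs_flush[cpu] = false;
	}
	b->pending = false;
}

int main(void)
{
	struct flush_batch b = { 0 };
	bool mm_cpus[NR_CPUS] = { [0] = true, [3] = true };

	/* Many pages unmapped, but the TLB work is only deferred... */
	for (int page = 0; page < 64; page++)
		batch_note_unmap(&b, mm_cpus);

	/* ...and a single flush pass replaces 64 rounds of IPIs. */
	batch_flush(&b);
	return 0;
}

[The deferral point is where the architecture choice Ben raises would
presumably sit: x86 sends the batched round as IPIs, while an architecture
with HW invalidation broadcasts could issue a single broadcast instead.]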