On Sun, Mar 1, 2015 at 5:04 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> Across the board the 4.0-rc1 numbers are much slower, and the
> degradation is far worse when using the large memory footprint
> configs. Perf points straight at the cause - this is from 4.0-rc1
> on the "-o bhash=101073" config:
>
> -   56.07%   56.07%  [kernel]  [k] default_send_IPI_mask_sequence_phys
>    - 99.99% physflat_send_IPI_mask
>       - 99.37% native_send_call_func_ipi ..
>
> And the same profile output from 3.19 shows:
>
> -    9.61%    9.61%  [kernel]  [k] default_send_IPI_mask_sequence_phys
>    - 99.98% physflat_send_IPI_mask
>       - 96.26% native_send_call_func_ipi ...
>
> So either there's been a massive increase in the number of IPIs
> being sent, or the cost per IPI has greatly increased. Either way,
> the result is a pretty significant performance degradation.

And on Mon, Mar 2, 2015 at 11:17 AM, Matt <jackdachef@xxxxxxxxx> wrote:
>
> Linus already posted a fix to the problem, however I can't seem to
> find the matching commit in his tree (searching for "TLC regression"
> or "TLB cache").

That was commit f045bbb9fa1b, which was then refined by commit
721c21c17ab9, because it turned out that ARM64 had a very subtle
relationship with tlb->end and fullmm.

But both of those hit 3.19, so none of this should affect 4.0-rc1.
There's something else going on.

I assume it's the mm queue from Andrew, so adding him to the cc.
There are changes to the page migration etc, which could explain it.

There are also a fair number of APIC changes in 4.0-rc1, so I guess it
really could be just that the IPI sending itself has gotten much
slower. Adding Ingo for that, although I don't think
default_send_IPI_mask_sequence_phys() itself has actually changed,
only other things around the apic.

So I'd be inclined to blame the mm changes.

Obviously bisection would find it..

                 Linus
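
A call-graph profile like the one Dave quotes is typically gathered
with perf; this is only a minimal sketch, since the thread doesn't say
which events or options he used, and <workload> below is a placeholder
for whatever reproduces the slowdown:

    # sample kernel call graphs system-wide while the slow workload runs;
    # <workload> stands in for whatever reproduces the regression
    perf record -a -g -- <workload>

    # fold the samples into the Children/Self caller tree quoted above
    perf report --children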
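
The bisection Linus suggests would look roughly like this, assuming
v3.19 is the last known-good kernel and that each candidate is built,
booted and benchmarked by hand (no automated test script is implied
by the thread):

    git bisect start
    git bisect bad v4.0-rc1      # the slow kernel
    git bisect good v3.19        # the fast kernel
    # build, boot and benchmark each candidate kernel, then mark it:
    #   git bisect good    # performance matches 3.19
    #   git bisect bad     # performance matches 4.0-rc1
    # repeat until git names the first bad commit
    git bisect reset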