Re: Kernel Benchmarking

Michael Larabel <Michael@xxxxxxxxxxxxxxxxxx> · Sat, 12 Sep 2020 09:44:15 -0500

On 9/12/20 9:37 AM, Matthew Wilcox wrote:
On Sat, Sep 12, 2020 at 05:32:11AM -0500, Michael Larabel wrote:
On 9/12/20 2:28 AM, Amir Goldstein wrote:
On Sat, Sep 12, 2020 at 1:40 AM Michael Larabel
<Michael@xxxxxxxxxxxxxxxxxx> wrote:
On 9/11/20 5:07 PM, Linus Torvalds wrote:
On Fri, Sep 11, 2020 at 9:19 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
Ok, it's probably simply that fairness is really bad for performance
here in general, and that special case is just that - a special case,
not the main issue.

Ahh. It turns out that I should have looked more at the fault path
after all. It was higher up in the profile, but I ignored it because I
found that lock-unlock-lock pattern lower down.

The main contention point is actually filemap_fault(). Your apache
test accesses the 'test.html' file that is mmap'ed into memory, and
all the threads hammer on that one single file concurrently and that
seems to be the main page lock contention.

Which is really sad - the page lock there isn't really all that
interesting, and the normal "read()" path doesn't even take it. But
faulting the page in does so because the page will have a long-term
existence in the page tables, and so there's a worry about racing with
truncate.
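
An illustrative sketch (not from the thread) of the two access patterns
contrasted above: many threads touching one file through a fresh mmap()
mapping, where the first touch of each page goes through the page-fault
path, versus fetching the same bytes with pread(). The file contents,
helper names, and thread counts are assumptions made for illustration.
CPython's global interpreter lock serializes much of the work, so this
shows the shape of the workload rather than reproducing the contention.

```python
import mmap
import os
import tempfile
import threading

PAGE = mmap.PAGESIZE


def make_test_file(n_pages):
    """Create a temp file of n_pages pages; page i is filled with byte i % 256."""
    fd, path = tempfile.mkstemp(suffix=".html")  # hypothetical stand-in for test.html
    with os.fdopen(fd, "wb") as f:
        for i in range(n_pages):
            f.write(bytes([i % 256]) * PAGE)
    return path


def touch_via_mmap(path, n_pages):
    """Map the file and read one byte per page.

    Each call creates a new mapping, so the first access to every page
    is resolved through the kernel's page-fault path.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return sum(m[i * PAGE] for i in range(n_pages))


def touch_via_read(path, n_pages):
    """Read the same bytes with pread(), which never maps the file."""
    with open(path, "rb") as f:
        return sum(os.pread(f.fileno(), 1, i * PAGE)[0] for i in range(n_pages))


def hammer(fn, path, n_pages, n_threads=8, rounds=5):
    """Run fn(path, n_pages) from several threads against the same file."""
    results = []

    def worker():
        for _ in range(rounds):
            results.append(fn(path, n_pages))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Running hammer(touch_via_mmap, ...) and hammer(touch_via_read, ...) against
the same file under a profiler would be one way to compare the two paths.
A C or multi-process version would be needed to observe real page lock
contention.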

Interesting, but also very annoying.

Anyway, I don't have a solution for it, but thought I'd let you know
that I'm still looking at this.

                   Linus

I've been running your EXT4 patch on more systems and with some
additional workloads today. While not the original problem, the patch
does seem to help a fair amount for the MariaDB database server. This
wasn't one of the workloads regressing on 5.9 but at least with the
systems tried so far the patch does make a meaningful improvement to the
performance. I haven't run into any apparent issues with that patch, so
I'm continuing to try it out on more systems and other database/server
workloads.

Michael,

Can you please add a reference to the original problem report and
to the offending commit? This conversation appeared on the list without
this information.

Are filesystems other than ext4 also affected by this performance
regression?

Thanks,
Amir.

On Linux 5.9 Git, Apache HTTPD, Redis, Nginx, and Hackbench appear to be the
main workloads that are running measurably slower than on Linux 5.8 and
prior on multiple systems.

The issue was bisected to 2a9127fcf2296674d58024f83981f40b128fffea. The
Kernel Test Robot also previously flagged the commit in question, with
mixed Hackbench results. Looking at the perf data, Linus had a hunch that
the commit might interact badly with the EXT4 locking behavior, so he sent
out that patch. That EXT4 patch didn't end up addressing the performance
issue with the original workloads in question (though in testing other
workloads it seems to benefit MariaDB at least; depending upon the system,
performance can be slightly better).

Based on this limited amount of information, I would suspect there would
also be a problem with XFS, and that would be even _more_ sad because
XFS already excludes a truncate-vs-mmap race with the MMAPLOCK_SHARED in
__xfs_filemap_fault vs MMAPLOCK_EXCL ... somewhere in the truncate path,
I'm sure.  It's definitely there for the holepunch.

So maybe XFS should have its own implementation of filemap_fault,
or we should have a filemap_fault_locked() for filesystems which have
their own locking that excludes truncate.

Interesting, I'll fire up some cross-filesystem benchmarks with those 
tests today and report back shortly with the difference.

Michael