Re: Slowdown with kernel 4.18.0

On 2018-08-24 9:27 AM, Rolf Eike Beer wrote:
On 2018-08-24 14:09, Helge Deller wrote:
> On 2018-08-21 2:31 PM, Rolf Eike Beer wrote:

With this patch I get this timing:

gcc7 - Time: 2018-08-24T01:42:07
gcc6 - Time: 2018-08-24T06:01:27

Somewhere in between 4.17.3 and 4.18.0.

Rolf, I think the plain 4.18 kernel is missing Dave's speed-up patches:
* http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7797167ffde1f00446301cb22b37b7c03194cfaf
* http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3b885ac1dc35b87a39ee176a6c7e2af9c789d8b8
Both patches have been scheduled to be added to the 4.18-stable kernel.

I guess I'd better revert the previous patch from Dave, no?
I would say leave the patch.  The big part of the slowdown is the sync barrier in the TLB handler.  The patches above don't address that issue; they should speed up spin locks in general.

On PA 2.0 SMP machines, we need either a sync or an ordered store to release a spin lock.  Otherwise, the lock may be released before the other accesses in the lock region are complete.  As a result, the operation isn't atomic from the perspective of other CPUs.  There's no getting around this issue on PA 2.0 systems.
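As a rough illustration of the release requirement described above, here is a minimal sketch in portable C11 atomics. It is not the parisc kernel code: the type and function names (demo_spinlock_t, demo_lock, demo_unlock, demo_increment) are hypothetical, and C11 memory_order_release merely stands in for the ordered store or sync mentioned above.

```c
#include <stdatomic.h>

/* Hypothetical lock type for illustration; not the kernel's arch_spinlock_t. */
typedef struct {
    atomic_int locked;              /* 0 = free, 1 = held */
} demo_spinlock_t;

static void demo_lock(demo_spinlock_t *l)
{
    /* Acquire: spin until we are the one who changes 0 -> 1. */
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire))
        ;
}

static void demo_unlock(demo_spinlock_t *l)
{
    /*
     * Release store: every access made inside the lock region is ordered
     * before this store becomes visible to other CPUs. A relaxed store
     * here would allow the lock to appear free too early.
     */
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* Data protected by the lock (assumption for the example). */
static int shared_counter;

static int demo_increment(demo_spinlock_t *l)
{
    demo_lock(l);
    int value = ++shared_counter;   /* access inside the lock region */
    demo_unlock(l);
    return value;
}
```

The point of the sketch is only the ordering on the unlock path; how that ordering is best obtained on real PA 2.0 hardware is the open question discussed here.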

I plan to look more at using ordered loads and stores in the spin lock code as they clearly don't impact performance as much as sync.

Regarding the TLB code, it turned out that we were always setting the page accessed bit for user pages, so the code that sets it when a user page is accessed is redundant.  We need the lock to update the accessed and dirty bits atomically.  One option is to keep the current TLB locking code and stop setting the page accessed bit in our user page defines.  This should improve swap behavior, but the TLB handler stays more complex.  The alternative is to remove the locking and the accessed-bit update code from the TLB handler.  This provides the best performance for TLB inserts, but swap performance will be worse since we would no longer track the accessed bit.
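To make the locked update of the accessed and dirty bits concrete, here is a small C sketch of a read-modify-write on a page table entry under a lock. Everything in it is an assumption for illustration: the bit values, the single global lock, and the names DEMO_PAGE_ACCESSED, DEMO_PAGE_DIRTY and demo_pte_set_bits are made up, and the real handler is written in assembly.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit positions; the real parisc PTE layout differs. */
#define DEMO_PAGE_ACCESSED 0x1u
#define DEMO_PAGE_DIRTY    0x2u

/* Single global lock guarding PTE updates (assumption for the example). */
static atomic_flag demo_pte_lock = ATOMIC_FLAG_INIT;

/* Set bits in *pte under the lock and return the resulting PTE value. */
static uint32_t demo_pte_set_bits(uint32_t *pte, uint32_t bits)
{
    while (atomic_flag_test_and_set_explicit(&demo_pte_lock,
                                             memory_order_acquire))
        ;                           /* spin until the lock is ours */

    uint32_t updated = *pte | bits;
    if (updated != *pte)
        *pte = updated;             /* store only when a bit actually changes */

    /* Release so the PTE store is visible before the lock appears free. */
    atomic_flag_clear_explicit(&demo_pte_lock, memory_order_release);
    return updated;
}
```

Without the lock, two CPUs could both read the old entry and one update would be lost; dropping the accessed-bit update from the handler removes this read-modify-write from the TLB insert path entirely, which is the trade-off described above.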

Dave

--
John David Anglin  dave.anglin@xxxxxxxx



