Re: [RFC][PATCH v2] parisc: Add alternative coding when running UP

On 16.10.2018 14:08, John David Anglin wrote:
> On 2018-10-16 1:34 AM, Helge Deller wrote:
>> On 15.10.2018 23:11, James Bottomley wrote:
>>> On Sun, 2018-10-14 at 20:34 +0200, Helge Deller wrote:
>>>> This patch adds the necessary code to patch a running SMP kernel
>>>> at runtime to improve performance when running on a single CPU.
>>>>
>>>> The current implementation offers two patching variants:
>>>> - Unwanted assembler statements like locking functions are overwritten
>>>>    with NOPs. When multiple instructions shall be skipped, one branch
>>>>    instruction is used instead of multiple nop instructions.
>>> This seems like a good idea because our spinlocks are particularly
>>> heavyweight.
>>>
>>>> - Some pdtlb and pitlb instructions are patched to become pdtlb,l and
>>>>    pitlb,l which only flush the CPU-local tlb entries instead of
>>>>    broadcasting the flush to other CPUs in the system and thus may
>>>>    improve performance.
>>> I really don't think this matters: on a UP system, pdtlb,l and pdtlb
>>> are the same instruction because the CPU already knows it has no
>>> internal CPU bus to broadcast the purge over so it in effect executes a
>>> pdtlb,l regardless.
>> I'd be happy to drop this part again.
>> But is that true on a SMP system, where one has booted with maxcpus=1, too?
> I would like to see what happens on panama.  Panama is an rp3410. Currently, it takes
> approximately 4042 cycles to flush one page (4096 bytes).  This is way more than the number
> of cycles that I see on my rp3440.  My c3750 takes 450 cycles per page with the patch.  It could
> be that pdtlb,l and pdtlb are equivalent on c3750.

Depends on what you flush.
On c3750 we may get fooled because the kernel area could have been mapped via huge pages,
while on rp34x0 the PA8900 CPU prevents huge pages for the kernel.
That may explain the performance difference between c3750 and rp3410, but not
the difference between rp3410 and rp3440.

> Is there something wrong with SMP on panama?
> Oct  4 02:27:56 panama kernel: [    0.061736] smp: Bringing up secondary CPUs ...
> Oct  4 02:27:56 panama kernel: [    0.061897] smp: Brought up 3 nodes, 1 CPU

Will check tomorrow.
 
> I know replacing "sync and normal store" with ordered store in spin lock release makes a
> significant difference in the above timing.  Plan to send patch tonight.

What exactly do you want me to test on panama?
Is the git head of my latest for-next tree [1] OK?

Helge

[1] https://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux.git/log/?h=for-next
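
For readers following the thread, the NOP/branch variant described in the
quoted patch text can be sketched roughly as below. This is only an
illustration, not the code from the patch: patch_out() is a hypothetical
helper, the two instruction encodings are assumptions that should be checked
against the PA-RISC architecture manual, and a real kernel would also have to
flush the instruction cache after rewriting text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed PA-RISC encodings, used here for illustration only. */
#define INSN_NOP     0x08000240u  /* "nop", i.e. or %r0,%r0,%r0 */
#define INSN_BN_BASE 0xe8000002u  /* "b,n .+8": branch, nullify delay slot */

/*
 * Hypothetical helper: neutralise `len` consecutive 32-bit instruction
 * words starting at `from`.
 *
 *  - len == 1: overwrite the single word with a NOP.
 *  - len >= 2: overwrite only the first word with one nullifying branch
 *    that jumps to the first word after the range, so the skipped words
 *    never execute and need not be rewritten.
 */
static void patch_out(uint32_t *from, size_t len)
{
    /* The sketch only handles short ranges (small branch displacement). */
    assert(len < 1024);

    if (len == 0)
        return;

    if (len == 1) {
        from[0] = INSN_NOP;
        return;
    }

    /* Each additional skipped word adds 8 to the assumed encoding. */
    from[0] = INSN_BN_BASE + (uint32_t)(len - 2) * 8;
}
```

Called as patch_out(code, 1) this writes a single NOP; patch_out(code, 3)
writes one branch into code[0] and leaves code[1] and code[2] untouched,
which is the "one branch instead of multiple nops" idea from the patch
description.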


