Re: Latest PA8800/PA8900 cache flush patch

On 2022-04-27 5:21 p.m., Helge Deller wrote:
On 4/27/22 23:08, Helge Deller wrote:
On 4/27/22 23:04, Helge Deller wrote:
On 4/27/22 22:50, John David Anglin wrote:
On 2022-04-27 4:44 p.m., John David Anglin wrote:
On 2022-04-26 4:43 p.m., Helge Deller wrote:
I have removed the flush_cache_dup_mm code, as this improves performance
and will hopefully fix the issues on B160L.
I applied your patch on top of for-next tree.
I still see the same issue on the C3700 (PA8700 (PCX-W2)) with 32bit kernel...
Maybe it's not related to your cache flush patches but to mine?

Boot fails with your 32-bit config on c3750 with latest cache patch.  So, problem appears
to have been introduced earlier.
Bah, I meant "without".
Yes, happens before for-next tree and your patches.

I still have an install problem with 5.17.0-1-parisc64.
So, you run 5.17 (debian) and it is unstable? I'll try, but currently again my time is limited.
Debian 5.17 is based on stable-5.17-3.
This version includes this patch:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.17.y&id=e115f5a44360c4a2f158074ecb3feea88c45fdc0
Greg pulled it down from 5.18-rc...
Maybe that's the issue?
I've built 5.17.5 (32bit). Boots ok on c3000. No segfaults.
But I do see the stalls as well:
...
Starting Avahi mDNS/DNS-SD Daemon: avahi-daemon.
Starting periodic command scheduler: cron.
[   31.472708] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[   31.543577]  (detected by 0, t=2102 jiffies, g=7361, q=10)
[   31.609191] rcu: All QSes seen, last rcu_sched kthread activity 2102 (-22271--24373), jiffies_till_next_fqs=1, root ->qsmask 0x0
[   31.747614] rcu: rcu_sched kthread starved for 2102 jiffies! g7361 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[   31.867313] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[   31.974535] rcu: RCU grace-period kthread stack dump:
[   32.034962] task:rcu_sched       state:R  running task     stack:    0 pid:   10 ppid:     2 flags:0x00000000
[   32.153733] Backtrace:
[   32.181916]  [<1094c21c>] __schedule+0x2dc/0x964
[   32.237240]  [<1094c90c>] schedule+0x68/0x138
[   32.289340]  [<10953068>] schedule_timeout+0x84/0x178
[   32.349762]  [<102472b4>] rcu_gp_fqs_loop+0x32c/0x428
[   32.410186]  [<10249660>] rcu_gp_kthread+0x10c/0x1e8
[   32.469569]  [<101ebc98>] kthread+0x100/0x108
[   32.521674]  [<1019b01c>] ret_from_kernel_thread+0x1c/0x24

ARGH!!!
This was introduced by the following commit:

commit d97180ad68bdb7ee10f327205a649bc2f558741d
Author: Helge Deller <deller@xxxxxx>
Date:   Wed Sep 8 23:27:00 2021 +0200

    parisc: Mark sched_clock unstable only if clocks are not syncronized

    We check at runtime if the cr16 clocks are stable across CPUs. Only mark
    the sched_clock unstable by calling clear_sched_clock_stable() if we
    know that we run on a system which isn't syncronized across CPUs.

    Signed-off-by: Helge Deller <deller@xxxxxx>

In searching for the cause, I also noticed this commit:

commit e4f2006f1287e7ea17660490569cff323772dac4
Author: Helge Deller <deller@xxxxxx>
Date:   Tue Sep 7 05:03:29 2021 +0200

    parisc: Reduce sigreturn trampoline to 3 instructions

    We can move the INSN_LDI_R20 instruction into the branch delay slot.

    Signed-off-by: Helge Deller <deller@xxxxxx>

Changing the sigreturn trampoline breaks gdb's detection of signal frames.
I suspect the INSN_LDI_R20 instruction was intentionally put before the
branch to make the sequence more unique.
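
To illustrate why reordering the trampoline can break detection, here is a minimal sketch of the kind of fixed-pattern match a debugger might use. It is not gdb's actual code. The opcode values are placeholders rather than real PA-RISC encodings, the function names are hypothetical, and the old layout (load, load, branch, nop) is an assumption based on the commit text above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration only; NOT real PA-RISC encodings. */
enum {
    INSN_LDI_R25 = 0x1, /* stands in for the in_syscall flag load */
    INSN_LDI_R20 = 0x2, /* stands in for the syscall number load */
    INSN_BLE     = 0x3, /* stands in for the branch to the gateway */
    INSN_NOP     = 0x4  /* stands in for the delay-slot nop */
};

/* Assumed old layout: LDI_R20 sits before the branch, nop in the delay slot. */
static const uint32_t old_tramp[] = {
    INSN_LDI_R25, INSN_LDI_R20, INSN_BLE, INSN_NOP
};

/* Layout after the commit: LDI_R20 moved into the branch delay slot. */
static const uint32_t new_tramp[] = {
    INSN_LDI_R25, INSN_BLE, INSN_LDI_R20
};

/* Return true if the first pat_len words of code equal the pattern. */
bool matches(const uint32_t *code, size_t code_len,
             const uint32_t *pat, size_t pat_len)
{
    if (code_len < pat_len)
        return false;
    for (size_t i = 0; i < pat_len; i++) {
        if (code[i] != pat[i])
            return false;
    }
    return true;
}

/* A matcher that only knows the old sequence. */
bool is_sigtramp_old_only(const uint32_t *code, size_t len)
{
    return matches(code, len, old_tramp,
                   sizeof(old_tramp) / sizeof(old_tramp[0]));
}

/* A matcher taught both sequences. */
bool is_sigtramp_both(const uint32_t *code, size_t len)
{
    return is_sigtramp_old_only(code, len) ||
           matches(code, len, new_tramp,
                   sizeof(new_tramp) / sizeof(new_tramp[0]));
}
```

With these definitions, is_sigtramp_old_only() accepts old_tramp but rejects new_tramp, which is the failure mode described above. A tool would need something like is_sigtramp_both() to recognize frames from both old and new kernels.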

Dave

--
John David Anglin  dave.anglin@xxxxxxxx



