On 2022-04-27 5:21 p.m., Helge Deller wrote:
On 4/27/22 23:08, Helge Deller wrote:
On 4/27/22 23:04, Helge Deller wrote:
On 4/27/22 22:50, John David Anglin wrote:
On 2022-04-27 4:44 p.m., John David Anglin wrote:
On 2022-04-26 4:43 p.m., Helge Deller wrote:
I have removed the flush_cache_dup_mm code as it improves performance
and hopefully it will fix issues on B160L.
I applied your patch on top of for-next tree.
I still see the same issue on the C3700 (PA8700 (PCX-W2)) with 32bit kernel...
Maybe it's not related to your cache flush patches but to mine?
Boot fails with your 32-bit config on c3750 with the latest cache patch. So, the problem appears
Bah, I meant "without".
to have been introduced earlier.
Yes, happens before for-next tree and your patches.
I still have an install problem with 5.17.0-1-parisc64.
So, you run 5.17 (Debian) and it is unstable? I'll try, but my time is limited again at the moment.
Debian 5.17 is based on stable-5.17-3.
This version includes this patch:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.17.y&id=e115f5a44360c4a2f158074ecb3feea88c45fdc0
Greg pulled it down from 5.18-rc...
Maybe that's the issue?
I've built 5.17.5 (32bit). Boots ok on c3000. No segfaults.
But I do see the stalls as well:
...
Starting Avahi mDNS/DNS-SD Daemon: avahi-daemon.
Starting periodic command scheduler: cron.
[ 31.472708] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 31.543577] (detected by 0, t=2102 jiffies, g=7361, q=10)
[ 31.609191] rcu: All QSes seen, last rcu_sched kthread activity 2102 (-22271--24373), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 31.747614] rcu: rcu_sched kthread starved for 2102 jiffies! g7361 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 31.867313] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 31.974535] rcu: RCU grace-period kthread stack dump:
[ 32.034962] task:rcu_sched state:R running task stack: 0 pid: 10 ppid: 2 flags:0x00000000
[ 32.153733] Backtrace:
[ 32.181916] [<1094c21c>] __schedule+0x2dc/0x964
[ 32.237240] [<1094c90c>] schedule+0x68/0x138
[ 32.289340] [<10953068>] schedule_timeout+0x84/0x178
[ 32.349762] [<102472b4>] rcu_gp_fqs_loop+0x32c/0x428
[ 32.410186] [<10249660>] rcu_gp_kthread+0x10c/0x1e8
[ 32.469569] [<101ebc98>] kthread+0x100/0x108
[ 32.521674] [<1019b01c>] ret_from_kernel_thread+0x1c/0x24
ARGH!!!
This was introduced by the following commit:
commit d97180ad68bdb7ee10f327205a649bc2f558741d
Author: Helge Deller <deller@xxxxxx>
Date: Wed Sep 8 23:27:00 2021 +0200
parisc: Mark sched_clock unstable only if clocks are not syncronized
We check at runtime if the cr16 clocks are stable across CPUs. Only mark
the sched_clock unstable by calling clear_sched_clock_stable() if we
know that we run on a system which isn't syncronized across CPUs.
Signed-off-by: Helge Deller <deller@xxxxxx>
In searching for the cause, I also noticed this commit:
commit e4f2006f1287e7ea17660490569cff323772dac4
Author: Helge Deller <deller@xxxxxx>
Date: Tue Sep 7 05:03:29 2021 +0200
parisc: Reduce sigreturn trampoline to 3 instructions
We can move the INSN_LDI_R20 instruction into the branch delay slot.
Signed-off-by: Helge Deller <deller@xxxxxx>
Changing the sigreturn trampoline breaks gdb's detection of signal frames.
I suspect the INSN_LDI_R20 instruction was intentionally put before the
branch to make the sequence more unique.
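To illustrate the concern, here is a minimal sketch of how a debugger might recognize a signal frame by comparing the words at the return address against a fixed instruction pattern. Everything in it is an assumption made for illustration: the opcode tags are placeholders rather than real PA-RISC encodings, the first trampoline instruction and the old delay-slot nop are guesses, and matches_pattern() and detects_signal_frame() are hypothetical names, not gdb functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder opcode tags. These are NOT real PA-RISC instruction words. */
enum {
    OP_SETUP   = 1, /* hypothetical first trampoline instruction */
    OP_LDI_R20 = 2, /* stand-in for INSN_LDI_R20 */
    OP_BRANCH  = 3, /* stand-in for the branch instruction */
    OP_NOP     = 4  /* assumed delay-slot filler in the old layout */
};

/* Assumed old layout: INSN_LDI_R20 sits before the branch. */
const uint32_t old_trampoline[] = { OP_SETUP, OP_LDI_R20, OP_BRANCH, OP_NOP };

/* Layout after the commit: INSN_LDI_R20 sits in the branch delay slot. */
const uint32_t new_trampoline[] = { OP_SETUP, OP_BRANCH, OP_LDI_R20 };

/* Return 1 if the first pattern_len words of code equal pattern, else 0. */
int matches_pattern(const uint32_t *code, size_t code_len,
                    const uint32_t *pattern, size_t pattern_len)
{
    assert(code != NULL && pattern != NULL);
    if (code_len < pattern_len)
        return 0;
    for (size_t i = 0; i < pattern_len; i++) {
        if (code[i] != pattern[i])
            return 0;
    }
    return 1;
}

/* A matcher that only knows the old four-word layout. */
int detects_signal_frame(const uint32_t *code, size_t code_len)
{
    return matches_pattern(code, code_len, old_trampoline,
                           sizeof(old_trampoline) / sizeof(old_trampoline[0]));
}
```

With these placeholder values, detects_signal_frame() accepts old_trampoline and rejects new_trampoline, since a matcher written for the old layout no longer finds its sequence once the instructions are reordered and the trampoline is shortened.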
Dave
--
John David Anglin dave.anglin@xxxxxxxx