Re: Latest PA8800/PA8900 cache flush patch

Hi Dave,

On 5/7/22 03:55, John David Anglin wrote:
> On 2022-05-06 6:34 p.m., John David Anglin wrote:
>> On 2022-05-06 5:30 p.m., John David Anglin wrote:
>>>> I've built 5.17.5 (32bit). Boots ok on c3000. No segfaults.
>>>> But I do see the stalls as well:
>>>> ...
>>>> Starting Avahi mDNS/DNS-SD Daemon: avahi-daemon.
>>>> Starting periodic command scheduler: cron.
>>>> [   31.472708] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
>>>> [   31.543577]  (detected by 0, t=2102 jiffies, g=7361, q=10)
>>>> [   31.609191] rcu: All QSes seen, last rcu_sched kthread activity 2102 (-22271--24373), jiffies_till_next_fqs=1, root ->qsmask 0x0
>>>> [   31.747614] rcu: rcu_sched kthread starved for 2102 jiffies! g7361 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
>>>> [   31.867313] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
>>>> [   31.974535] rcu: RCU grace-period kthread stack dump:
>>>> [   32.034962] task:rcu_sched       state:R  running task     stack:    0 pid:   10 ppid:     2 flags:0x00000000
>>>> [   32.153733] Backtrace:
>>>> [   32.181916]  [<1094c21c>] __schedule+0x2dc/0x964
>>>> [   32.237240]  [<1094c90c>] schedule+0x68/0x138
>>>> [   32.289340]  [<10953068>] schedule_timeout+0x84/0x178
>>>> [   32.349762]  [<102472b4>] rcu_gp_fqs_loop+0x32c/0x428
>>>> [   32.410186]  [<10249660>] rcu_gp_kthread+0x10c/0x1e8
>>>> [   32.469569]  [<101ebc98>] kthread+0x100/0x108
>>>> [   32.521674]  [<1019b01c>] ret_from_kernel_thread+0x1c/0x24
>>>>
>>>> ARGH!!!
>>> This was introduced by the following commit:
>>>
>>> commit d97180ad68bdb7ee10f327205a649bc2f558741d
>>> Author: Helge Deller <deller@xxxxxx>
>>> Date:   Wed Sep 8 23:27:00 2021 +0200
>>>
>>>     parisc: Mark sched_clock unstable only if clocks are not syncronized
>>>
>>>     We check at runtime if the cr16 clocks are stable across CPUs. Only mark
>>>     the sched_clock unstable by calling clear_sched_clock_stable() if we
>>>     know that we run on a system which isn't syncronized across CPUs.
>>>
>>>     Signed-off-by: Helge Deller <deller@xxxxxx>
>>>
>>> In searching for the cause, I also noticed this commit:
>>>
>>> commit e4f2006f1287e7ea17660490569cff323772dac4
>>> Author: Helge Deller <deller@xxxxxx>
>>> Date:   Tue Sep 7 05:03:29 2021 +0200
>>>
>>>     parisc: Reduce sigreturn trampoline to 3 instructions
>>>
>>>     We can move the INSN_LDI_R20 instruction into the branch delay slot.
>>>
>>>     Signed-off-by: Helge Deller <deller@xxxxxx>
>>>
>>> Changing the sigreturn trampoline breaks gdb's detection of signal frames.
>>> I suspect the INSN_LDI_R20 instruction was intentionally put before the
>>> branch to make the sequence more unique.
>>
>> It appears the latter commit has been reverted.  The former commit has been modified.
> 32bit v5.15.37 boots successfully if setup.c and time.c are reverted to v5.14.  Otherwise,
> boot stalls as above.

Thank you for investing your time to find the problem!
The patches you mentioned are easy to revert; I have queued up the revert patches now.
It seems commit d97180ad68bdb7ee10f327205a649bc2f558741d was wrong, and the follow-up patch
made it even worse.
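
For reference, the policy that the commit message describes can be sketched in plain user-space C as below. This is only an illustration under stated assumptions: the flag `sched_clock_stable_flag`, the helper `mark_sched_clock()` and its `cr16_synchronized` parameter are hypothetical names, and `clear_sched_clock_stable()` here is a stub that merely mimics the kernel function named in the commit message. It is not the code from setup.c or time.c, and it does not show where the stall comes from.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's "sched_clock is stable" state. */
static bool sched_clock_stable_flag = true;

/*
 * User-space stub named after the kernel function mentioned in the
 * commit message; here it only flips the flag above.
 */
static void clear_sched_clock_stable(void)
{
    sched_clock_stable_flag = false;
}

/*
 * Hypothetical helper: mark sched_clock unstable only when the cr16
 * clocks are known not to be synchronized across CPUs, as the commit
 * message states. The runtime check itself is assumed to happen
 * elsewhere and is passed in as a boolean.
 */
static void mark_sched_clock(bool cr16_synchronized)
{
    if (!cr16_synchronized)
        clear_sched_clock_stable();
}
```

With synchronized clocks the flag stays set; once an unsynchronized system is reported, it is cleared and stays cleared.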

OK, so now we need to find out why v5.18-rc crashes... :-(

Helge



