Hi Paul,
Regarding cache flushing in machine_kexec(), I recommend simply removing
__flush_cache_all() as CPU0 will do local icache flush before jumping to
reboot_code_buffer, and other CPUs don't jump at all.
Re: marking CPU offline in kexec_this_cpu(), it's probably good NOT doing
it either, as the system is going to reboot from CPU0 soon. HALT is good
enough, no need to gate core power. As to whether it's safe running
play_dead() in parallel, it shouldn't be a problem, as the loop is based
on cpu online mask (which we don't mark offline as mentioned above) -- The
CPU will simply halt itself. For non-MT CPUs, play_dead() does make sense
as well, as it's supposed to stop this CPU's execution, getting ready to
reboot from CPU0.
On UHI stuff, if it's true "future platforms will just use
arch/mips/generic", then certainly no need of getting code out of Generic.
My feeling is the customer is NOT doing so, according to "I fixed it by
moving the code to another file". Other MIPS systems may be like this as
well. I understand your "upstream-or-copy-code" point, but the only thing,
which IMO meaningfully justifies patch 6 *for_such_cases*, is already said:
-----------------
When the kernel developer of an out-of-tree platform
wants to make kexec work, they will naturally look at machine_kexec.c,
where they will find this UHI stuff, obviously telling them to do
"select UHI_BOOT". Otherwise, unless they google onto this discussion
thread, it's harder to know the solution to the kexec_args related
problem hides in the code of another platform (Generic).
-----------------
In other words, it may take considerable efforts to realize "copying the
generic_kexec_prepare() function" can solve the problem. Copying code itself
is certainly not onerous.
Regards,
Dengcheng
On 09/06/2018 01:34 PM, Paul Burton wrote:
Hi Dengcheng,
On Thu, Sep 06, 2018 at 12:19:49PM -0700, Dengcheng Zhu wrote:
I didn't apply patch 4 because I'm not sure it's correct & I believe the
changes in the branch above should take care of it - CPUs that reach
kexec_this_cpu() are maked offline, so they shouldn't be IPI'd by
__flush_cache_all().
I believe patch 4 is necessary. As mentioned in the code comment and patch
comment of that patch, machine_crash_shutdown() is called prior to
machine_kexec() in the kdump sequence. So other CPUs have disabled local
IRQs waiting for the reboot signal.
In fact, in kexec_this_cpu() [you renamed and modified kexec_smp_reboot()],
the added marking CPU offline will cause system hang (tested). This is
because it will change how play_dead() will work.
So the problem is that patch 4 isn't really right either, and I suspect
the hang you mention probably shows a problem with the whole play_dead()
approach from patches 1 & 2 too.
On the cache flushing, what we really need is for stores performed by
the CPU running machine_kexec() via its dcache to be observed by the
icache of the CPU that jumps into reboot_code_buffer, which is always
CPU 0 after patch 2. Using local_flush_icache_range() will only ensure
that the icache of the CPU running machine_kexec() observes the changes
made by that same CPU, and does nothing with CPU 0 unless they happen to
be the same CPU or siblings sharing caches.
Consider I6x00 CPUs for example - patch 4 may *almost* work OK because
both cpu_has_ic_fills_f_dc & cpu_icache_snoops_remote_store are true, so
when the machine_kexec() CPU writes to reboot_code_buffer CPU 0's icache
will observe that & will fill from dcache without needing it to be
written back to L2. It's still not quite right because if CPU 0's icache
already contained any valid lines within reboot_code_buffer (which could
happen speculatively) then it'll execute stale/garbage code.
The hang you mention is likely down to the fact that play_dead(), at
least the smp-cps.c implementation of it, is written for the CPU hotplug
infrastructure which will only operate on one CPU at a time. The loop
looking for a sibling CPU just isn't going to be safe to run on multiple
CPUs in parallel like patch 2 does. That's true of either your original
patch or my modification - it's just that my modified patch will cause
siblings to race towards powering off the core, whilst your original
will cause them to all halt themselves & never power off the core.
Which inclines me to return to my earlier thought that perhaps we
shouldn't use play_dead() for this at all.
The CPU that runs machine_kexec() should still
flush its dcache (& the L2), and then CPU 0 invalidates its icache in
kexec_this_cpu() prior to jumping into reboot_code_buffer.
I'm also still not sure about patch 6 - since no platforms besides the
arch/mips/generic/ make use of the UHI boot code yet I think it'd be
best to leave as-is. If we do ever need to use it from another platform
then we can deal with the problem then. If an out of tree platform needs
to use it then for now it could copy generic_kexec_prepare() and deal
with removing the duplication when it heads upstream.
Understood. It really depends on how this problem is viewed. If patch #6
is considered creating a framework for future UHI platforms, then it has
the following facts:
* It doesn't create code redundancy. I mean, it does not add unnecessary
code to the kernel.
* Out of tree platforms will get access to this functionality by a simple
"select UHI_BOOT". When the kernel developer of an out-of-tree platform
wants to make kexec work, they will naturally look at machine_kexec.c,
where they will find this UHI stuff, obviously telling them to do
"select UHI_BOOT". Otherwise, unless they google onto this discussion
thread, it's harder to know the solution to the kexec_args related
problem hides in the code of another platform (Generic).
* It simplifies work if the out of tree platform wants to upstream.
Well, ideally future platforms will just use arch/mips/generic rather
than adding more platform/board code at all. Right now the only reason
not to is if a system has a memory map that doesn't fit nicely, which
should be resolved at some point by mapping the kernel which will allow
the generic kernel to better support differing physical address maps.
So I hope we don't add more platform code anyway, which would make all
this moot. I certainly don't want to encourage adding more by explicitly
making it easier to do. And if there does come a point where we need to
add more platform code for some good reason then we can deal with this
at that point rather than speculating now.
For out of tree platforms, I don't think that copying the
generic_kexec_prepare() function is particularly onerous anyway, and
should be trivial to remove if such code is submitted upstream. Whilst I
wouldn't want to make submitting code upstream more difficult, it's
likely to need some amount of rework if submitted upstream anyway so
this doesn't seem like a big deal.
I don't think upstream should be particularly concerned with making life
easy for out of tree code - if one wants the benefit of their code being
taken into account upstream, then one should submit one's code upstream.
In the paraphrased words of my pastor at a marriage prep class - no
benefits (except those already conferred by free access to open source
software) without commitment.
Thanks,
Paul
Dengcheng
---------------------------------------------------------------------------
*From:* Paul Burton <mailto:paul.burton@xxxxxxxx>
*Sent:* Thursday, September 06, 2018 1:34PM
*To:* Dengcheng Zhu <mailto:dzhu@xxxxxxxxxxxx>
*Cc:* Pburton <mailto:pburton@xxxxxxxxxxxx>, Ralf
<mailto:ralf@xxxxxxxxxxxxxx>, Linux-mips
<mailto:linux-mips@xxxxxxxxxxxxxx>, Rachel.mozes
<mailto:rachel.mozes@xxxxxxxxx>
*Subject:* Re: [PATCH v4 0/6] MIPS: kexec/kdump: Fix smp reboot and other
issues
Hi Dengcheng,
On Thu, Sep 06, 2018 at 12:19:49PM -0700, Dengcheng Zhu wrote:
I didn't apply patch 4 because I'm not sure it's correct & I believe the
changes in the branch above should take care of it - CPUs that reach
kexec_this_cpu() are maked offline, so they shouldn't be IPI'd by
__flush_cache_all().
I believe patch 4 is necessary. As mentioned in the code comment and patch
comment of that patch, machine_crash_shutdown() is called prior to
machine_kexec() in the kdump sequence. So other CPUs have disabled local
IRQs waiting for the reboot signal.
In fact, in kexec_this_cpu() [you renamed and modified kexec_smp_reboot()],
the added marking CPU offline will cause system hang (tested). This is
because it will change how play_dead() will work.
So the problem is that patch 4 isn't really right either, and I suspect
the hang you mention probably shows a problem with the whole play_dead()
approach from patches 1 & 2 too.
On the cache flushing, what we really need is for stores performed by
the CPU running machine_kexec() via its dcache to be observed by the
icache of the CPU that jumps into reboot_code_buffer, which is always
CPU 0 after patch 2. Using local_flush_icache_range() will only ensure
that the icache of the CPU running machine_kexec() observes the changes
made by that same CPU, and does nothing with CPU 0 unless they happen to
be the same CPU or siblings sharing caches.
Consider I6x00 CPUs for example - patch 4 may *almost* work OK because
both cpu_has_ic_fills_f_dc & cpu_icache_snoops_remote_store are true, so
when the machine_kexec() CPU writes to reboot_code_buffer CPU 0's icache
will observe that & will fill from dcache without needing it to be
written back to L2. It's still not quite right because if CPU 0's icache
already contained any valid lines within reboot_code_buffer (which could
happen speculatively) then it'll execute stale/garbage code.
The hang you mention is likely down to the fact that play_dead(), at
least the smp-cps.c implementation of it, is written for the CPU hotplug
infrastructure which will only operate on one CPU at a time. The loop
looking for a sibling CPU just isn't going to be safe to run on multiple
CPUs in parallel like patch 2 does. That's true of either your original
patch or my modification - it's just that my modified patch will cause
siblings to race towards powering off the core, whilst your original
will cause them to all halt themselves & never power off the core.
Which inclines me to return to my earlier thought that perhaps we
shouldn't use play_dead() for this at all.
The CPU that runs machine_kexec() should still
flush its dcache (& the L2), and then CPU 0 invalidates its icache in
kexec_this_cpu() prior to jumping into reboot_code_buffer.
I'm also still not sure about patch 6 - since no platforms besides the
arch/mips/generic/ make use of the UHI boot code yet I think it'd be
best to leave as-is. If we do ever need to use it from another platform
then we can deal with the problem then. If an out of tree platform needs
to use it then for now it could copy generic_kexec_prepare() and deal
with removing the duplication when it heads upstream.
Understood. It really depends on how this problem is viewed. If patch #6
is considered creating a framework for future UHI platforms, then it has
the following facts:
* It doesn't create code redundancy. I mean, it does not add unnecessary
code to the kernel.
* Out of tree platforms will get access to this functionality by a simple
"select UHI_BOOT". When the kernel developer of an out-of-tree platform
wants to make kexec work, they will naturally look at machine_kexec.c,
where they will find this UHI stuff, obviously telling them to do
"select UHI_BOOT". Otherwise, unless they google onto this discussion
thread, it's harder to know the solution to the kexec_args related
problem hides in the code of another platform (Generic).
* It simplifies work if the out of tree platform wants to upstream.
Well, ideally future platforms will just use arch/mips/generic rather
than adding more platform/board code at all. Right now the only reason
not to is if a system has a memory map that doesn't fit nicely, which
should be resolved at some point by mapping the kernel which will allow
the generic kernel to better support differing physical address maps.
So I hope we don't add more platform code anyway, which would make all
this moot. I certainly don't want to encourage adding more by explicitly
making it easier to do. And if there does come a point where we need to
add more platform code for some good reason then we can deal with this
at that point rather than speculating now.
For out of tree platforms, I don't think that copying the
generic_kexec_prepare() function is particularly onerous anyway, and
should be trivial to remove if such code is submitted upstream. Whilst I
wouldn't want to make submitting code upstream more difficult, it's
likely to need some amount of rework if submitted upstream anyway so
this doesn't seem like a big deal.
I don't think upstream should be particularly concerned with making life
easy for out of tree code - if one wants the benefit of their code being
taken into account upstream, then one should submit one's code upstream.
In the paraphrased words of my pastor at a marriage prep class - no
benefits (except those already conferred by free access to open source
software) without commitment.
Thanks,
Paul