Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> writes: > I think we should avoid torture offline the cpu who do tick timer > when nohz full is running. Can you tell us what the bug you're fixing is? Did you see a crash/oops/hang etc? Or are you just proposing this as something that would be a good idea? > Tested on PPC VM of Open Source Lab of Oregon State University. > The test results show that after the fix, the success rate of > rcutorture is improved. > After: > Successes: 40 Failures: 9 > Before: > Successes: 38 Failures: 11 > > I examined the console.log and Make.out files one by one, no new > compile error or test error is introduced by above fix. > > Signed-off-by: Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> > --- > Dear PPC developers > > I found this bug when trying to do rcutorture tests in ppc VM of > Open Source Lab of Oregon State University: > > ubuntu@ubuntu:~/linux-next/tools/testing/selftests/rcutorture/res/2022.09.30-01.06.22-torture$ find . -name "console.log.diags"|xargs grep HOTPLUG > ./results-scftorture/NOPREEMPT/console.log.diags:WARNING: HOTPLUG FAILURES NOPREEMPT > ./results-rcutorture/TASKS03/console.log.diags:WARNING: HOTPLUG FAILURES TASKS03 > ./results-rcutorture/TREE04/console.log.diags:WARNING: HOTPLUG FAILURES TREE04 > ./results-scftorture-kasan/NOPREEMPT/console.log.diags:WARNING: HOTPLUG FAILURES NOPREEMPT > ./results-rcutorture-kasan/TASKS03/console.log.diags:WARNING: HOTPLUG FAILURES TASKS03 > ./results-rcutorture-kasan/TREE04/console.log.diags:WARNING: HOTPLUG FAILURES TREE04 > > I tried to fix this bug. > > Thanks for your patience and guidance ;-) > > Thanks > Zhouyi > -- > arch/powerpc/kernel/sysfs.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c > index ef9a61718940..be9c0e45337e 100644 > --- a/arch/powerpc/kernel/sysfs.c > +++ b/arch/powerpc/kernel/sysfs.c > @@ -4,6 +4,7 @@ > #include <linux/smp.h> > #include <linux/percpu.h> > #include <linux/init.h> > +#include <linux/tick.h> > #include <linux/sched.h> > #include <linux/export.h> > #include <linux/nodemask.h> > @@ -21,6 +22,7 @@ > #include <asm/firmware.h> > #include <asm/idle.h> > #include <asm/svm.h> > +#include "../../../kernel/time/tick-internal.h" Needing to include this internal header is a sign that we are using the wrong API or otherwise using time keeping internals we shouldn't be. > #include "cacheinfo.h" > #include "setup.h" > @@ -1151,7 +1153,11 @@ static int __init topology_init(void) > * CPU. For instance, the boot cpu might never be valid > * for hotplugging. > */ > - if (smp_ops && smp_ops->cpu_offline_self) > + if (smp_ops && smp_ops->cpu_offline_self > +#ifdef CONFIG_NO_HZ_FULL > + && !(tick_nohz_full_running && tick_do_timer_cpu == cpu) > +#endif > + ) I can't see any other arches doing anything like this. I don't think it's the arches responsibility. If the time keeping core needs a CPU to stay online to run the timer then it needs to organise that itself IMHO :) cheers > c->hotpluggable = 1; > #endif > > -- > 2.25.1