On Mon, Oct 10, 2022 at 11:49 AM Nicholas Piggin <npiggin@xxxxxxxxx> wrote:
>
> On Thu Sep 29, 2022 at 11:48 AM AEST, Zhouyi Zhou wrote:
> > On Wed, Sep 28, 2022 at 10:51 AM Nicholas Piggin <npiggin@xxxxxxxxx> wrote:
> > >
> > > On Wed Sep 28, 2022 at 11:48 AM AEST, Zhouyi Zhou wrote:
> > > > Thank Nick for reviewing my patch
> > > >
> > > > On Tue, Sep 27, 2022 at 12:25 PM Nicholas Piggin <npiggin@xxxxxxxxx> wrote:
> > > > >
> > > > > On Tue Sep 27, 2022 at 11:48 AM AEST, Zhouyi Zhou wrote:
> > > > > > This is second version of my fix to PPC's "WARNING: suspicious RCU usage",
> > > > > > I improved my fix under Paul E. McKenney's guidance:
> > > > > > Link: https://lore.kernel.org/lkml/20220914021528.15946-1-zhouzhouyi@xxxxxxxxx/T/
> > > > > >
> > > > > > During the cpu offlining, the sub functions of xive_teardown_cpu will
> > > > > > call __lock_acquire when CONFIG_LOCKDEP=y. The latter function will
> > > > > > travel RCU protected list, so "WARNING: suspicious RCU usage" will be
> > > > > > triggered.
> > > > > >
> > > > > > Avoid lockdep when we are offline.
> > > > >
> > > > > I don't see how this is safe. If RCU is no longer watching the CPU then
> > > > > the memory it is accessing here could be concurrently freed. I think the
> > > > > warning is valid.
> > > > Agree
> > > > >
> > > > > powerpc's problem is that cpuhp_report_idle_dead() is called before
> > > > > arch_cpu_idle_dead(), so it must not rely on any RCU protection there.
> > > > > I would say xive cleanup just needs to be done earlier. I wonder why it
> > > > > is not done in __cpu_disable or thereabouts, that's where the interrupt
> > > > > controller is supposed to be stopped.
> > > > Yes, I learn flowing events sequence from kgdb debugging
> > > > __cpu_disable -> pseries_cpu_disable -> set_cpu_online(cpu, false) =
> > > > leads to => do_idle: if (cpu_is_offline(cpu) -> arch_cpu_idle_dead
> > > > so xive cleanup should be done in pseries_cpu_disable.
> > >
> > > It's a good catch and a reasonable approach to the problem.
> > Thank Nick for your encouragement ;-)
> > >
> > > > But as a beginner, I afraid that I am incompetent to do above
> > > > sophisticated work without error although I am very like to,
> > > > Could any expert do this for us?
> > >
> > > This will be difficult for anybody, it's tricky code. I'm not an
> > > expert at it.
> > >
> > > It looks like the interrupt controller disable split has been there
> > > since long before xive. I would try just move them together than see
> > > if that works.
> > Yes, I use "git blame" (I learned "git blame" from Paul E. McKenny ;-)
> > ) to see the same.
> > and anticipate your great works!
>
> I was thinking you could try it and see if it works and what you find.
> If you are interested and have time to look into it?
I am interested! and I have time ;-)
Thank Nick for your trust in me!
I am going to submit my babyish work in about a month (counting the
rcutoture tests time), and thank you in advance for your patience.

Cheers
Zhouyi
>
> Thanks,
> Nick