* Christophe Leroy <christophe.leroy@xxxxxxxxxx> [230912 11:27]: > > > Le 12/09/2023 à 17:08, Paul E. McKenney a écrit : > > On Tue, Sep 12, 2023 at 10:29:30AM -0400, Liam R. Howlett wrote: > >> * Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> [230912 09:56]: > >>> * Paul E. McKenney <paulmck@xxxxxxxxxx> [230912 06:00]: > >>>> On Tue, Sep 12, 2023 at 10:34:44AM +0200, Geert Uytterhoeven wrote: > >>>>> Hi Paul, > >>>>> > >>>>> On Tue, Sep 12, 2023 at 10:30 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: > >>>>>> On Tue, Sep 12, 2023 at 10:23:37AM +0200, Geert Uytterhoeven wrote: > >>>>>>> On Tue, Sep 12, 2023 at 10:14 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: > >>>>>>>> On Mon, Sep 11, 2023 at 07:54:52PM -0400, Liam R. Howlett wrote: > >>>>>>>>> * Paul E. McKenney <paulmck@xxxxxxxxxx> [230906 14:03]: > >>>>>>>>>> On Wed, Sep 06, 2023 at 01:29:54PM -0400, Liam R. Howlett wrote: > >>>>>>>>>>> * Paul E. McKenney <paulmck@xxxxxxxxxx> [230906 13:24]: > >>>>>>>>>>>> On Wed, Sep 06, 2023 at 11:23:25AM -0400, Liam R. Howlett wrote: > >>>>>>>>>>>>> (Adding Paul & Shanker to Cc list.. please see below for why) > >>>>>>>>>>>>> > >>>>>>>>>>>>> Apologies on the late response, I was away and have been struggling to > >>>>>>>>>>>>> get a working PPC32 test environment. > >>>>>>>>>>>>> > >>>>>>>>>>>>> * Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> [230829 12:42]: > >>>>>>>>>>>>>> Hi Liam, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Fri, 18 Aug 2023, Liam R. Howlett wrote: > >>>>>>>>>>>>>>> The current implementation of append may cause duplicate data and/or > >>>>>>>>>>>>>>> incorrect ranges to be returned to a reader during an update. Although > >>>>>>>>>>>>>>> this has not been reported or seen, disable the append write operation > >>>>>>>>>>>>>>> while the tree is in rcu mode out of an abundance of caution. > >>>>>>>>>>>>> > >>>>>>>>>>>>> ... > >>>>>>>>>>>>>>> > >>>>>>>>> > >>>>>>>>> ... > >>>>>>>>> > >>>>>>>>>>>>>> RCU-related configs: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> $ grep RCU .config > >>>>>>>>>>>>>> # RCU Subsystem > >>>>>>>>>>>>>> CONFIG_TINY_RCU=y > >>>>>>>> > >>>>>>>> I must have been asleep last time I looked at this. I was looking at > >>>>>>>> Tree RCU. Please accept my apologies for my lapse. :-/ > >>>>>>>> > >>>>>>>> However, Tiny RCU's call_rcu() also avoids enabling IRQs, so I would > >>>>>>>> have said the same thing, albeit after looking at a lot less RCU code. > >>>>>>>> > >>>>>>>> TL;DR: > >>>>>>>> > >>>>>>>> 1. Try making the __setup_irq() function's call to mutex_lock() > >>>>>>>> instead be as follows: > >>>>>>>> > >>>>>>>> if (!mutex_trylock(&desc->request_mutex)) > >>>>>>>> mutex_lock(&desc->request_mutex); > >>>>>>>> > >>>>>>>> This might fail if __setup_irq() has other dependencies on a > >>>>>>>> fully operational scheduler. > >> > >> This changes where the interrupts become enabled, but doesn't stop it > >> from happening. It still throws a WARN after init_IRQ(). I suspect it > >> is not the way to proceed as there are probably many places that will > >> need to be changed in both common and arch specific code. As soon as > >> TIF_NEED_RESCHED is set, then all the checks will need to be altered. > > > > Thank you for trying it! > > > >> I think we either need to set the boot thread to !idle, avoid call_rcu() > >> to set TIF_NEED_RESCHED (how was this working before? Do we need rcu > >> for the IRQs?), or alter the boot order (note this is NOT arch or > >> platform code here). > >> > >> I don't like any of these. I'd like another option, please? > > > > My favorite is to move the interrupt enabling later, but Michael Ellerman > > would know better than would I about the feasibility of this. > > > > I digged into it a bit more, looks like IRQs get enabled by the call to > cond_resched() in the loop in vm_area_alloc_pages(), which is called > from powerpc's init_IRQ() fonction when allocating IRQ stacks. This is another location where the process will enable IRQs because TIF_NEED_RESCHED was set. > > And IRQ stacks definitely need to be enabled before IRQs get enabled, so > there's something wrong here isn't it ?