Re: [PATCH v2 1/2] maple_tree: Disable mas_wr_append() when other readers are possible

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



* Christophe Leroy <christophe.leroy@xxxxxxxxxx> [230912 11:27]:
> 
> 
> Le 12/09/2023 à 17:08, Paul E. McKenney a écrit :
> > On Tue, Sep 12, 2023 at 10:29:30AM -0400, Liam R. Howlett wrote:
> >> * Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> [230912 09:56]:
> >>> * Paul E. McKenney <paulmck@xxxxxxxxxx> [230912 06:00]:
> >>>> On Tue, Sep 12, 2023 at 10:34:44AM +0200, Geert Uytterhoeven wrote:
> >>>>> Hi Paul,
> >>>>>
> >>>>> On Tue, Sep 12, 2023 at 10:30 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> >>>>>> On Tue, Sep 12, 2023 at 10:23:37AM +0200, Geert Uytterhoeven wrote:
> >>>>>>> On Tue, Sep 12, 2023 at 10:14 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> >>>>>>>> On Mon, Sep 11, 2023 at 07:54:52PM -0400, Liam R. Howlett wrote:
> >>>>>>>>> * Paul E. McKenney <paulmck@xxxxxxxxxx> [230906 14:03]:
> >>>>>>>>>> On Wed, Sep 06, 2023 at 01:29:54PM -0400, Liam R. Howlett wrote:
> >>>>>>>>>>> * Paul E. McKenney <paulmck@xxxxxxxxxx> [230906 13:24]:
> >>>>>>>>>>>> On Wed, Sep 06, 2023 at 11:23:25AM -0400, Liam R. Howlett wrote:
> >>>>>>>>>>>>> (Adding Paul & Shanker to Cc list.. please see below for why)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Apologies on the late response, I was away and have been struggling to
> >>>>>>>>>>>>> get a working PPC32 test environment.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> * Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> [230829 12:42]:
> >>>>>>>>>>>>>>      Hi Liam,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Fri, 18 Aug 2023, Liam R. Howlett wrote:
> >>>>>>>>>>>>>>> The current implementation of append may cause duplicate data and/or
> >>>>>>>>>>>>>>> incorrect ranges to be returned to a reader during an update.  Although
> >>>>>>>>>>>>>>> this has not been reported or seen, disable the append write operation
> >>>>>>>>>>>>>>> while the tree is in rcu mode out of an abundance of caution.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> ...
> >>>>>>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> ...
> >>>>>>>>>
> >>>>>>>>>>>>>> RCU-related configs:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>      $ grep RCU .config
> >>>>>>>>>>>>>>      # RCU Subsystem
> >>>>>>>>>>>>>>      CONFIG_TINY_RCU=y
> >>>>>>>>
> >>>>>>>> I must have been asleep last time I looked at this.  I was looking at
> >>>>>>>> Tree RCU.  Please accept my apologies for my lapse.  :-/
> >>>>>>>>
> >>>>>>>> However, Tiny RCU's call_rcu() also avoids enabling IRQs, so I would
> >>>>>>>> have said the same thing, albeit after looking at a lot less RCU code.
> >>>>>>>>
> >>>>>>>> TL;DR:
> >>>>>>>>
> >>>>>>>> 1.      Try making the __setup_irq() function's call to mutex_lock()
> >>>>>>>>          instead be as follows:
> >>>>>>>>
> >>>>>>>>          if (!mutex_trylock(&desc->request_mutex))
> >>>>>>>>                  mutex_lock(&desc->request_mutex);
> >>>>>>>>
> >>>>>>>>          This might fail if __setup_irq() has other dependencies on a
> >>>>>>>>          fully operational scheduler.
> >>
> >> This changes where the interrupts become enabled, but doesn't stop it
> >> from happening.  It still throws a WARN after init_IRQ(). I suspect it
> >> is not the way to proceed as there are probably many places that will
> >> need to be changed in both common and arch specific code.  As soon as
> >> TIF_NEED_RESCHED is set, then all the checks will need to be altered.
> > 
> > Thank you for trying it!
> > 
> >> I think we either need to set the boot thread to !idle, avoid call_rcu()
> >> to set TIF_NEED_RESCHED (how was this working before?  Do we need rcu
> >> for the IRQs?), or alter the boot order (note this is NOT arch or
> >> platform code here).
> >>
> >> I don't like any of these.  I'd like another option, please?
> > 
> > My favorite is to move the interrupt enabling later, but Michael Ellerman
> > would know better than would I about the feasibility of this.
> > 
> 
> I digged into it a bit more, looks like IRQs get enabled by the call to 
> cond_resched() in the loop in vm_area_alloc_pages(), which is called 
> from powerpc's init_IRQ() fonction when allocating IRQ stacks.

This is another location where the process will enable IRQs because
TIF_NEED_RESCHED was set.

> 
> And IRQ stacks definitely need to be enabled before IRQs get enabled, so 
> there's something wrong here isn't it ?





[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux