Re: [5.4-rc1, regression] wb_workfn wakeup oops (was Re: frequent 5.4-rc1 crash?)

On Thu, Oct 03, 2019 at 11:37:46AM -0700, Darrick J. Wong wrote:
> On Thu, Oct 03, 2019 at 08:05:42AM -0600, Jens Axboe wrote:
> > On 10/3/19 8:01 AM, Chris Mason wrote:
> > > 
> > > 
> > > On 3 Oct 2019, at 4:41, Gao Xiang wrote:
> > > 
> > >> Hi,
> > >>
> > >> On Thu, Oct 03, 2019 at 04:40:22PM +1000, Dave Chinner wrote:
> > >>> [cc linux-fsdevel, linux-block, tejun ]
> > >>>
> > >>> On Wed, Oct 02, 2019 at 06:52:47PM -0700, Darrick J. Wong wrote:
> > >>>> Hi everyone,
> > >>>>
> > > >>>> Does anyone /else/ see this crash in generic/299 on a V4 filesystem (tho
> > > >>>> afaict V5 configs crash too) and a 5.4-rc1 kernel?  It seems to pop up
> > > >>>> on generic/299 though only 80% of the time.
> > >>>>
> > >>
> > > >> Just a quick glance; I guess there could be a race here (complete
> > > >> guess):
> > >>
> > >>
> > > >>   static void finish_writeback_work(struct bdi_writeback *wb,
> > > >>                                     struct wb_writeback_work *work)
> > > >>   {
> > > >>           struct wb_completion *done = work->done;
> > > >>
> > > >>           if (work->auto_free)
> > > >>                   kfree(work);
> > > >>           if (done && atomic_dec_and_test(&done->cnt))
> > > >>
> > > >>   ^^^ here
> > > >>
> > > >>                   wake_up_all(done->waitq);
> > > >>   }
> > >>
> > > >> since the new wake_up_all(done->waitq) dereferences done, which is
> > > >> completely on-stack:
> > >>   	if (done && atomic_dec_and_test(&done->cnt))
> > >> -		wake_up_all(&wb->bdi->wb_waitq);
> > >> +		wake_up_all(done->waitq);
> > >>   }
> > >>
> > > >> which could cause a use-after-free if the on-stack wb_completion is
> > > >> already gone... (however, the previous wb->bdi was solid since it is
> > > >> not on-stack)
> > >>
> > > >> see the generic on-stack completion, which takes the wait_queue
> > > >> spin_lock between the test and the wake_up...
> > >>
> > >> If I am wrong, ignore me, hmm...
> > > 
> > > It's a good guess ;)  Jens should have this queued up already:
> > > 
> > > https://lkml.org/lkml/2019/9/23/972
> > 
> > Yes indeed, it'll go out today or tomorrow for -rc2.
> 
> The patch fixes the problems I've been seeing, so:
> Tested-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> 
> Thank you for taking care of this. :)

Hmm, I don't see this patch in -rc2; did it not go out in time, or were
there further complications?

--D

> --D
> 
> > -- 
> > Jens Axboe
> > 
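
For anyone following along, the race guessed at above can be modelled in
user space. The sketch below is only an illustration of one way to avoid
touching the completion after the final decrement: copy the wait queue
pointer out first, then decrement. It is not taken from the linked patch,
which may do something different. All names here (waitq_model,
completion_model, wake_up_all_model, finish_work_model) are hypothetical
stand-ins, and C11 atomics stand in for the kernel's atomic_t helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a long-lived wait queue (not on the waiter's stack). */
struct waitq_model {
	int wakeups;			/* number of wake-ups delivered in this model */
};

/* Hypothetical stand-in for the on-stack completion owned by the waiter. */
struct completion_model {
	atomic_int cnt;			/* outstanding work items */
	struct waitq_model *waitq;	/* points at long-lived storage */
};

static void wake_up_all_model(struct waitq_model *wq)
{
	wq->wakeups++;
}

/*
 * Drop one reference on @done. Returns true if this call dropped the last
 * reference and issued the wake-up.
 *
 * The waitq pointer is loaded BEFORE the decrement: once the count reaches
 * zero, the waiter may return and its stack frame (holding @done) may be
 * gone, so @done must not be dereferenced after atomic_fetch_sub().
 */
static bool finish_work_model(struct completion_model *done)
{
	struct waitq_model *waitq;

	if (!done)
		return false;

	assert(atomic_load(&done->cnt) > 0);	/* sanity check, still safe here */
	waitq = done->waitq;

	/* atomic_fetch_sub() returns the old value; old == 1 means new == 0. */
	if (atomic_fetch_sub(&done->cnt, 1) == 1) {
		wake_up_all_model(waitq);	/* uses the saved pointer only */
		return true;
	}
	return false;
}
```

With a count of two, the first call to finish_work_model() only
decrements, and the second delivers the single wake-up through the saved
pointer without reading the completion again.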


