Re: AW: [PATCH] md/bcache: Fix a deadlock while calculating writeback rate

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 10 May 2017, Phillipp Röll wrote:

> Hi Eric,
> 
> > Thank you for tracking this down!  Did you have an actual deadlock case that
> > this fixed?
> 
> Yes. We had at least dozens, if not hundreds, of "kernel hung task" events
> followed by complete IO lockups on our production systems, with traces like
> this: https://paste42.de/059f78f7f9f27e9d88cd65fc02811e92/11749/
> 
> Reproduction is actually pretty easy:
> 
> - Setup bcache with SSD + HDD and LVM on top of the resulting bcache device
> - Create three LVs in the new cached VG
> - Start dd'ing from one of these LVs to another LV
> - Start fio with randrw on the third LV
> - Wait 5-10 minutes for the message to pop up and the system to lock up
> 
> > I wonder: in the event of a race, can the updated P-factor be invalid (ie,
> > partially populated)? I don't believe 64-bit assignments are atomic on all
> > architectures (see __update_writeback_rate) so you might get the upper or
> > lower 32 bits of the P-factor or of other values assigned therein.  Could
> > the rate be calculated in such a race by the writeback thread such that no
> > progress is made (eg, effectively rate==0)?
> 
> That seems to be right, yes. I think I can add that spinlock. Would you say
> the spinlock should be used for the sysfs-info as well?

Yes, anywhere it is used.  It might be a good idea to write a short 
accessor function if there are lots of places that need to grab the lock.  
Some kind of future-proofing would be a good idea to make it easy if 
anyone else grabs the value and doesn't realize it needs to be locked.

--
Eric Wheeler



> 
> We're using amd64 only and had the patch tested with IO stress tests for
> 10 to 14 days and running in prod for 3 months now without issue.
> But that doesn't mean anything for the other architectures...
> 
> Phillipp
> 
> > > Signed-off-by: Phillipp Röll <phillipp.roell@xxxxxxxxxxxxxx>
> > > ---
> > >  drivers/md/bcache/writeback.c | 4 ----
> > >  1 file changed, 4 deletions(-)
> > >
> > > diff --git a/drivers/md/bcache/writeback.c
> > b/drivers/md/bcache/writeback.c
> > > index 6ac2e48..90da3bd 100644
> > > --- a/drivers/md/bcache/writeback.c
> > > +++ b/drivers/md/bcache/writeback.c
> > > @@ -75,14 +75,10 @@ static void update_writeback_rate(struct
> > work_struct *work)
> > >  					     struct cached_dev,
> > >  					     writeback_rate_update);
> > >
> > > -	down_read(&dc->writeback_lock);
> > > -
> > >  	if (atomic_read(&dc->has_dirty) &&
> > >  	    dc->writeback_percent)
> > >  		__update_writeback_rate(dc);
> > >
> > > -	up_read(&dc->writeback_lock);
> > > -
> > >  	schedule_delayed_work(&dc->writeback_rate_update,
> > >  			      dc->writeback_rate_update_seconds * HZ);
> > >  }
> > > --
> > > 2.10.0
> > >
> > >
> > >
> 

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux ARM Kernel]     [Linux Filesystem Development]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux