Re: softlockup with CONFIG_XFS_ONLINE_SCRUB enabled

On Mon, Oct 28, 2019 at 08:30:03AM +0100, Christoph Hellwig wrote:
> On Sun, Oct 27, 2019 at 11:32:32AM -0700, Darrick J. Wong wrote:
> > On Fri, Oct 25, 2019 at 12:24:04PM +0200, Christoph Hellwig wrote:
> > > Hi Darrick,
> > > 
> > > the current xfs tree seems to easily cause soft lockups in generic/175
> > > (and a few other tests, but not as reproducible) for me.  This is on
> > > 20GB 4k block size images on a VM with 4 CPUs and 4G of RAM.
> > 
> > Hrm.  I haven't seen that before... what's your kernel config?
> > This looks like some kind of lockup in slub debugging...?
> > 
> > Also, is this a new thing?  Or something that used to happen with low
> > frequency but has slowly increased to the point that it's annoying?
> > 
> > (Or something else?)
> 
> Seems to happen with 5.3 as well.  I only recently turned
> CONFIG_XFS_ONLINE_SCRUB back on in my usual test config, that is what
> made it show up.
> 
> .config attached.

Aha, you have preempt disabled and slub debugging on by default, which
(on the million-extent files produced by generic/175) means that scrub
takes long enough to trip the soft lockup watchdog while checking the
bmap.  The test eventually finishes, but the obvious(ly stupid) bandaid
of calling touch_softlockup_watchdog() merely plunged the VM into
"rcu_sched self-detected stall on CPU" messages.  As it's late, I'll
set it aside until tomorrow.

IOWs I think I know what's going on but don't yet know how to fix it. :/
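
To make the failure mode concrete, here is a toy model in simulated
time.  It is not kernel code and not a proposed fix.  The per-extent
cost, the 20 second threshold, and the yield interval are all
assumptions picked for illustration, and the function names are
hypothetical.  It only shows that the longest stretch without a
scheduling point scales with the extent count when nothing in the loop
ever yields.

```python
# Toy model only: simulated time, no real watchdog or scheduler involved.
# Assumed soft lockup threshold in seconds (illustrative value).
WATCHDOG_THRESHOLD_S = 20.0


def longest_stretch(n_extents, cost_per_extent_s, yield_every=None):
    """Return the longest simulated run (seconds) with no scheduling point.

    yield_every=None models a loop that never yields the CPU;
    yield_every=N models a scheduling point after every N extents.
    """
    longest = 0.0
    current = 0.0
    for i in range(1, n_extents + 1):
        current += cost_per_extent_s
        if yield_every and i % yield_every == 0:
            # Scheduling point: record the stretch and start a new one.
            longest = max(longest, current)
            current = 0.0
    return max(longest, current)


def trips_watchdog(n_extents, cost_per_extent_s, yield_every=None):
    """True if any uninterrupted stretch exceeds the assumed threshold."""
    stretch = longest_stretch(n_extents, cost_per_extent_s, yield_every)
    return stretch > WATCHDOG_THRESHOLD_S


if __name__ == "__main__":
    # Hypothetical numbers: one million extents at 50 microseconds each.
    print(trips_watchdog(1_000_000, 50e-6))
    print(trips_watchdog(1_000_000, 50e-6, yield_every=1000))
```

The model covers the soft lockup watchdog only.  Resetting a watchdog
timestamp does not make the CPU pass through an RCU quiescent state,
which is consistent with the stall messages simply moving from one
detector to the other.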

--D


