On Thu, Mar 26, 2020 at 06:45:58PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
>
> A customer reported rcu stalls and softlockup warnings on a computer
> with many CPU cores and many many more IO threads trying to write to a
> filesystem that is totally out of space. Subsequent analysis pointed to
> the many many IO threads calling xfs_flush_inodes -> sync_inodes_sb,
> which causes a lot of wb_writeback_work to be queued. The writeback
> worker spends so much time trying to wake the many many threads waiting
> for writeback completion that it trips the softlockup detector, and (in
> this case) the system automatically reboots.

That doesn't sound right. Each writeback work that is queued via
sync_inodes_sb should only have a single process waiting on its
completion. And how many threads do you actually need to wake up for it
to trigger a 10s soft-lockup timeout? More detail, please?

> In addition, they complain that the lengthy xfs_flush_inodes scan traps
> all of those threads in uninterruptible sleep, which hampers their
> ability to kill the program or do anything else to escape the situation.
>
> Fix this by replacing the full filesystem flush (which is offloaded to a
> workqueue which we then have to wait for) with directly flushing the
> file that we're trying to write.

Which does nothing to flush -other- outstanding delalloc reservations
and allow the eofblocks/cowblock scan to reclaim unused post-EOF
speculative preallocations.

That's the purpose of xfs_flush_inodes() - without it we can get very
premature ENOSPC, especially on small filesystems when writing largish
files in the background. So I'm not sure that dropping the sync is a
viable solution. It is actually needed.

Perhaps we need to go back to the ancient code that only allowed XFS to
run a single xfs_flush_inodes() at a time - everything else waited on
the single flush to complete, then all returned at the same time...

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
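
The "single flush at a time" idea mentioned above can be illustrated with a small userspace sketch. This is not kernel code and not the old XFS implementation: the class name SingleFlightFlush, the flush_fn callback, and the waiters counter are all hypothetical, and Python threads stand in for the IO threads. The first caller runs the expensive flush; every caller that arrives while it is in progress sleeps until that one flush completes, and then all of them return together.

```python
import threading


class SingleFlightFlush:
    """Allow one flush at a time; concurrent callers wait for it to finish.

    Hypothetical userspace illustration only. flush_fn stands in for an
    expensive filesystem-wide flush.
    """

    def __init__(self, flush_fn):
        self._flush_fn = flush_fn
        self._cond = threading.Condition()
        self._in_progress = False
        self._generation = 0   # number of flushes completed so far
        self.waiters = 0       # callers currently sleeping on a flush

    def flush(self):
        """Return True if this caller ran the flush, False if it waited."""
        with self._cond:
            if self._in_progress:
                # Someone else is flushing: sleep until that flush completes.
                gen = self._generation
                self.waiters += 1
                while self._generation == gen:
                    self._cond.wait()
                self.waiters -= 1
                return False
            self._in_progress = True

        try:
            # Run the expensive work outside the lock so that late
            # arrivals can queue up as waiters.
            self._flush_fn()
        finally:
            with self._cond:
                self._in_progress = False
                self._generation += 1
                self._cond.notify_all()   # release every waiter at once
        return True
```

With N concurrent callers this queues one unit of flush work instead of N. The trade-off is that a waiter is only guaranteed a flush that finished after it arrived, not one that started after it arrived, so data dirtied while the flush was already running may not be covered.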