Re: [PATCH] xfs: cancel eofblocks background trimming on remount read-only

On Mon, Jun 06, 2016 at 03:24:10PM -0500, Eric Sandeen wrote:
> 
> 
> On 6/6/16 12:12 PM, Brian Foster wrote:
> > The filesystem quiesce sequence performs the operations necessary to
> > drain all background work, push pending transactions through the log
> > infrastructure and wait on I/O resulting from the final AIL push. We
> > have had reports of remount,ro hangs in xfs_log_quiesce() ->
> > xfs_wait_buftarg(), however, and some instrumentation code to detect
> > transaction commits at this point in the quiesce sequence has implicated
> > the eofblocks background scanner as a cause.
> > 
> > While higher level remount code generally prevents user modifications by
> > the time the filesystem has made it to xfs_log_quiesce(), the background
> > scanner may still be alive and can perform pending work at any time. If
> > this occurs between the xfs_log_force() and xfs_wait_buftarg() calls
> > within xfs_log_quiesce(), this can lead to an indefinite lockup in
> > xfs_wait_buftarg().
> > 
> > To prevent this problem, cancel the background eofblocks scan worker
> > during the remount read-only quiesce sequence. This suspends background
> > trimming when a filesystem is remounted read-only. This is only done in
> > the remount path because the freeze codepath has already locked out new
> > transactions by the time the filesystem attempts to quiesce (and thus
> > waiting on an active work item could deadlock). Kick the eofblocks
> > worker to pick up where it left off once an fs is remounted back to
> > read-write.
> > 
> > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> > ---
> > 
> > To confirm the problem, I have managed to manufacture the remount,ro
> > hang issue described above by hacking in some delays/coordination
> > between the quiesce and eofblocks background worker.
> > 
> > Also, an alternative approach that I was considering is to run a
> > synchronous scan around the same place this patch cancels the background
> > scan. The idea is that the background scan would then have no work to do
> > on the subsequent iteration and would die off naturally (until
> > preallocation occurs once again). I suppose the downside is that a
> > remount might not necessarily be expected to muck with the preallocation
> > state of affected files. This patch seemed simpler, but I'm open to
> > either. Thoughts?
> 
> Your approach makes sense to me - "finishing" the scan prior to quiesce
> could take a relatively unknown amount of time as well, right?  And I guess
> I don't see a real advantage to that; we don't wait for it on unmount
> (right?)
> 

Yes, it could take an unknown amount of time. It would probably be
similar to what we would do on an unmount. We don't explicitly wait for
a scan on unmount (though we destroy the workqueue at some point), but I
believe we have to reclaim all cached inodes, which then trims eofblocks
for every inode where appropriate (via xfs_inactive()).

> Reviewed-by: Eric Sandeen <sandeen@xxxxxxxxxx>
>  

Thanks!

Brian

> > Brian
> > 
> >  fs/xfs/xfs_icache.c | 2 +-
> >  fs/xfs/xfs_icache.h | 1 +
> >  fs/xfs/xfs_super.c  | 8 ++++++++
> >  3 files changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > index 99ee6eee..fb39a66 100644
> > --- a/fs/xfs/xfs_icache.c
> > +++ b/fs/xfs/xfs_icache.c
> > @@ -765,7 +765,7 @@ restart:
> >   * Background scanning to trim post-EOF preallocated space. This is queued
> >   * based on the 'speculative_prealloc_lifetime' tunable (5m by default).
> >   */
> > -STATIC void
> > +void
> >  xfs_queue_eofblocks(
> >  	struct xfs_mount *mp)
> >  {
> > diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
> > index 62f1f91..05bac99 100644
> > --- a/fs/xfs/xfs_icache.h
> > +++ b/fs/xfs/xfs_icache.h
> > @@ -68,6 +68,7 @@ void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);
> >  int xfs_icache_free_eofblocks(struct xfs_mount *, struct xfs_eofblocks *);
> >  int xfs_inode_free_quota_eofblocks(struct xfs_inode *ip);
> >  void xfs_eofblocks_worker(struct work_struct *);
> > +void xfs_queue_eofblocks(struct xfs_mount *);
> >  
> >  int xfs_inode_ag_iterator(struct xfs_mount *mp,
> >  	int (*execute)(struct xfs_inode *ip, int flags, void *args),
> > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> > index 4700f09..7965371 100644
> > --- a/fs/xfs/xfs_super.c
> > +++ b/fs/xfs/xfs_super.c
> > @@ -1294,6 +1294,7 @@ xfs_fs_remount(
> >  		 */
> >  		xfs_restore_resvblks(mp);
> >  		xfs_log_work_queue(mp);
> > +		xfs_queue_eofblocks(mp);
> >  	}
> >  
> >  	/* rw -> ro */
> > @@ -1306,6 +1307,13 @@ xfs_fs_remount(
> >  		 * return it to the same size.
> >  		 */
> >  		xfs_save_resvblks(mp);
> > +
> > +		/*
> > +		 * Cancel background eofb scanning so it cannot race with the
> > +		 * final log force+buftarg wait and deadlock the remount.
> > +		 */
> > +		cancel_delayed_work_sync(&mp->m_eofblocks_work);
> > +
> >  		xfs_quiesce_attr(mp);
> >  		mp->m_flags |= XFS_MOUNT_RDONLY;
> >  	}
> > 