Re: [PATCH] xfs: flush workers before stopping log

Dave Chinner <david@xxxxxxxxxxxxx> · Thu, 30 Aug 2012 10:23:35 +1000

On Wed, Aug 29, 2012 at 08:46:25AM -0500, tinguely@xxxxxxx wrote:
> The unmount race continues with our test boxes.
> 
> The below trace gave the clue that there is a write of the superblock
> after the log UNMOUNT record and xfs_logprint confirmed this write.
> 
> A couple of different experiments point to the sync worker. The simplest
> solution is to move the final flush of the workers before the final
> superblock write so there is no other filesystem activity after the
> UNMOUNT record is written to the log.

....
>  #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
>  #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
> #10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
> #11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
> #12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
> #13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
> #14 [c5377f58] process_one_work at c024ee4c
> #15 [c5377f98] worker_thread at c024f43d
> #16 [c5377fbc] kthread at c025326b
> #17 [c5377fe8] kernel_thread_helper at c070e834
> 
> PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
>  #0 [cde0fda0] __schedule at c0706595
>  #1 [cde0fe28] schedule at c0706b89
>  #2 [cde0fe30] schedule_timeout at c0705600
>  #3 [cde0fe94] __down_common at c0706098
>  #4 [cde0fec8] __down at c0706122
>  #5 [cde0fed0] down at c025936f
>  #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
>  #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]

OK, so you've got IO on the superblock buffer still active when the
superblock is being freed.
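
To make the umount trace concrete, here is a minimal user-space sketch of the
situation, assuming the buffer lock is held from IO submission until the
completion callbacks finish. All of the fake_* names are hypothetical stand-ins
and not the real XFS functions; the real xfs_buf_lock() sleeps in down() rather
than returning failure.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a buffer with a single-holder lock. */
struct fake_buf {
	bool locked;	/* true while someone holds the buffer lock */
};

/* Take the lock if it is free; return false instead of sleeping. */
bool fake_buf_trylock(struct fake_buf *bp)
{
	if (bp->locked)
		return false;
	bp->locked = true;
	return true;
}

void fake_buf_unlock(struct fake_buf *bp)
{
	bp->locked = false;
}

/* Assumption: submitting IO takes the lock and keeps it until completion. */
bool fake_submit_io(struct fake_buf *bp)
{
	return fake_buf_trylock(bp);
}

/* Completion processing releases the lock taken at submission. */
void fake_io_done(struct fake_buf *bp)
{
	fake_buf_unlock(bp);
}

/* Freeing the superblock buffer needs the lock first, as in the trace. */
bool fake_try_freesb(struct fake_buf *bp)
{
	return fake_buf_trylock(bp);
}
```

In this model fake_try_freesb() fails while a submitted IO has not completed,
which corresponds to umount sitting in down() under xfs_freesb() while the
worker thread is still running the buffer's iodone callbacks.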

> There should be no more I/O after the UNMOUNT record is written to the log.

That depends - a freeze leaves the filesystem in exactly this state.
:)

> Flush the workers before the final sync of the superblock, write of the
> UNMOUNT log record and tearing down the log.
> 
> This earlier flush prevents a late write of the superblock that raced with
> the filesystem shutdown.

I'm not sure xfs_sync_worker() can be responsible for this. It has an
MS_ACTIVE guard on it, so it will not log a dummy record (superblock)
during the unmount procedure, nor does it dispatch superblock buffer
IO. So it cannot be responsible for either the item in the log after
the unmount record or the IO that is being run.
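
As a rough illustration of that guard, here is a user-space sketch. It assumes
the active flag has already been cleared by the time the unmount path runs; the
fake_* names and the struct are hypothetical and only model the behaviour, they
are not the real XFS code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the mount; not the real struct xfs_mount. */
struct fake_mount {
	bool active;		/* models MS_ACTIVE being set on the superblock */
	int dummy_records;	/* number of dummy records "logged" so far */
};

/*
 * Periodic sync work. It only logs a dummy record while the filesystem
 * is still active. Returns true if a record was logged.
 */
bool fake_sync_worker(struct fake_mount *mp)
{
	if (!mp->active)
		return false;	/* unmount in progress: do nothing */
	mp->dummy_records++;
	return true;
}

/* Assumption: the flag is cleared before the unmount work starts. */
void fake_begin_unmount(struct fake_mount *mp)
{
	mp->active = false;
}
```

Once fake_begin_unmount() has run, every later call to fake_sync_worker() is a
no-op, so in this model the worker cannot be the source of a log item that
lands after the unmount record.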

> Index: b/fs/xfs/xfs_mount.c
> ===================================================================
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -1490,6 +1490,11 @@ xfs_unmountfs(
>  
>  	xfs_qm_unmount(mp);
>  
> +	/* flush the worker queues while the log still exists and
> +	 * before the final sync and unmount record.
> +	 */
> +	xfs_syncd_stop(mp);

xfs_syncd_stop() needs to die rather than being moved from place to
place every time some problem is seen.  I outlined what we need to
do to solve the problems once and for all a couple of months ago:

http://oss.sgi.com/archives/xfs/2012-06/msg00064.html

I could use a break from backporting and testing - I might do this
today...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
