On 18.12.2012 10:24, Dave Chinner wrote:
> On Mon, Dec 17, 2012 at 12:22:51AM +0100, Matthias Schniedermeyer wrote:
>
> > Dec 17 00:13:28 localhost kernel: [ 325.562642] lost page write due to I/O error on sdb1
> > Dec 17 00:13:28 localhost kernel: [ 325.562643] Buffer I/O error on device sdb1, logical block 2613285
> > Dec 17 00:13:28 localhost kernel: [ 325.562643] lost page write due to I/O error on sdb1
> > Dec 17 00:13:28 localhost kernel: [ 325.562644] Buffer I/O error on device sdb1, logical block 2613286
> > Dec 17 00:13:28 localhost kernel: [ 325.562644] lost page write due to I/O error on sdb1
> > Dec 17 00:13:28 localhost kernel: [ 325.562645] Buffer I/O error on device sdb1, logical block 2613287
> > Dec 17 00:13:28 localhost kernel: [ 325.562645] lost page write due to I/O error on sdb1
> > Dec 17 00:13:28 localhost kernel: [ 325.562646] Buffer I/O error on device sdb1, logical block 2613288
> > Dec 17 00:13:28 localhost kernel: [ 325.562646] lost page write due to I/O error on sdb1
> > Dec 17 00:13:28 localhost kernel: [ 325.647364] XFS (sdb1): metadata I/O error: block 0x6400022 ("xlog_iodone") error 5 numblks 64
> > Dec 17 00:13:28 localhost kernel: [ 325.648559] XFS (sdb1): xfs_do_force_shutdown(0x2) called from line 1074 of file /xssd/usr_src/linux/fs/xfs/xfs_log.c. Return address = 0xffffffff812a1e49
> > Dec 17 00:13:28 localhost kernel: [ 325.649804] XFS (sdb1): Log I/O Error Detected. Shutting down filesystem
>
> And there's why nothing got written to the log - you pulled the plug
> before it was flushed. But that also implies it wasn't flushed for a
> couple of minutes, which is not correct. They should be occurring
> every 30s when the filesystem is only slightly busy like this.
>
> Hmmmm, that rings a bell. I think I came across this when doing a
> RHEL6 backport recently, and decided that nothing needed doing
> upstream because there were already patches pending to fix it.
> That is, there is a logic bug in xfssyncd_worker(), and that was
> fixed by the xfssyncd-die-die-die patchset that is now in 3.8-rc1.

One 'bleeding edge' kernel later, so new that it isn't even called 3.8-rc1 yet. ;-)

I would call that successful:

ls -l /mnt/a
total 10661888
drwxr-xr-x 2 root root       105 Dec 18 00:55 .
drwxr-xr-x 3 root root        14 Dec 18 00:53 ..
-rw-r--r-- 1 root root 943718400 Dec 18 00:53 1
-rw-r--r-- 1 root root 943718400 Dec 18 00:54 10
-rw-r--r-- 1 root root 738263040 Dec 18 00:55 11
-rw-r--r-- 1 root root 322990080 Dec 18 00:55 12
-rw-r--r-- 1 root root 943718400 Dec 18 00:53 2
-rw-r--r-- 1 root root 943718400 Dec 18 00:53 3
-rw-r--r-- 1 root root 943718400 Dec 18 00:53 4
-rw-r--r-- 1 root root 943718400 Dec 18 00:53 5
-rw-r--r-- 1 root root 943718400 Dec 18 00:54 6
-rw-r--r-- 1 root root 943718400 Dec 18 00:54 7
-rw-r--r-- 1 root root 943718400 Dec 18 00:54 8
-rw-r--r-- 1 root root 943718400 Dec 18 00:54 9

The 12th file was being written when I yanked out the cable. It is
strange that the 11th file isn't complete; I would have expected the
events to be ordered correctly. Is this normal/expected behaviour for
XFS?
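For reference, the writer workload amounts to sequentially creating a
dozen ~900 MB files with plain buffered writes and no fsync(), so
whatever survives the cable pull is purely down to the filesystem's own
periodic flushing. A minimal C sketch of that kind of load (file names,
chunk size and the /mnt/a path are illustrative here, not the exact tool
I used):

/* Sequential buffered-write workload: create numbered ~900 MB files
 * with no fsync(), so persistence depends entirely on the filesystem's
 * periodic log/data flushing. Sizes and paths are illustrative. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define FILE_SIZE  (900L * 1024 * 1024)  /* 943718400 bytes, as in the listing */
#define CHUNK_SIZE (1024 * 1024)
#define NUM_FILES  12

int main(void)
{
	char *buf = malloc(CHUNK_SIZE);
	if (!buf)
		return 1;
	memset(buf, 'x', CHUNK_SIZE);

	for (int i = 1; i <= NUM_FILES; i++) {
		char path[64];
		snprintf(path, sizeof(path), "/mnt/a/%d", i);

		int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
		if (fd < 0) {
			perror("open");
			return 1;
		}
		for (long written = 0; written < FILE_SIZE; written += CHUNK_SIZE) {
			if (write(fd, buf, CHUNK_SIZE) != CHUNK_SIZE) {
				perror("write");
				return 1;
			}
		}
		/* Deliberately no fsync(): the point is to see what XFS
		 * persists on its own before the device disappears. */
		close(fd);
	}
	free(buf);
	return 0;
}

At the write rates visible in the timestamps, a working 30-second
background flush should bound the loss to roughly the last 30 seconds
of writes, which is about what the listing above shows.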
The 'ls -l /mnt/a' before I yanked out the cable was this:

/mnt/a:
total 11186176
-rw-r--r-- 1 root root 943718400 Dec 18 00:53 1
-rw-r--r-- 1 root root 943718400 Dec 18 00:54 10
-rw-r--r-- 1 root root 943718400 Dec 18 00:55 11
-rw-r--r-- 1 root root 665326592 Dec 18 00:55 12
-rw-r--r-- 1 root root 943718400 Dec 18 00:53 2
-rw-r--r-- 1 root root 943718400 Dec 18 00:53 3
-rw-r--r-- 1 root root 943718400 Dec 18 00:53 4
-rw-r--r-- 1 root root 943718400 Dec 18 00:53 5
-rw-r--r-- 1 root root 943718400 Dec 18 00:54 6
-rw-r--r-- 1 root root 943718400 Dec 18 00:54 7
-rw-r--r-- 1 root root 943718400 Dec 18 00:54 8
-rw-r--r-- 1 root root 943718400 Dec 18 00:54 9

> So, two things: firstly, can you check that the oldest file that is
> lost on 3.8-rc1 is less than 30s old,

I would say: a definite yes.

> and secondly, whether this
> patch gives the same result as 3.8-rc1 on 3.6/3.7:

3.7.1 + patch:

/mnt/a:
total 8425472
-rw-r--r-- 1 root root 943718400 Dec 18 01:12 1
-rw-r--r-- 1 root root  92282880 Dec 18 01:13 10
-rw-r--r-- 1 root root 943718400 Dec 18 01:12 2
-rw-r--r-- 1 root root 943718400 Dec 18 01:12 3
-rw-r--r-- 1 root root 943718400 Dec 18 01:12 4
-rw-r--r-- 1 root root 943718400 Dec 18 01:12 5
-rw-r--r-- 1 root root 943718400 Dec 18 01:13 6
-rw-r--r-- 1 root root 943718400 Dec 18 01:13 7
-rw-r--r-- 1 root root 943718400 Dec 18 01:13 8
-rw-r--r-- 1 root root 507555840 Dec 18 01:13 9

That looks the same to me, only with different timing relative to the
30s windows. This was the listing just before the yanking:

/mnt/a:
total 11186176
-rw-r--r-- 1 root root 943718400 Dec 18 01:12 1
-rw-r--r-- 1 root root 943718400 Dec 18 01:13 10
-rw-r--r-- 1 root root 943718400 Dec 18 01:13 11
-rw-r--r-- 1 root root 869729280 Dec 18 01:14 12
-rw-r--r-- 1 root root 943718400 Dec 18 01:12 2
-rw-r--r-- 1 root root 943718400 Dec 18 01:12 3
-rw-r--r-- 1 root root 943718400 Dec 18 01:12 4
-rw-r--r-- 1 root root 943718400 Dec 18 01:12 5
-rw-r--r-- 1 root root 943718400 Dec 18 01:13 6
-rw-r--r-- 1 root root 943718400 Dec 18 01:13 7
-rw-r--r-- 1 root root 943718400 Dec 18 01:13 8
-rw-r--r-- 1 root root 943718400 Dec 18 01:13 9

> diff --git a/fs/xfs/xfs_sync.c b/fs/xfs/xfs_sync.c
> index 9500caf..7bf85e8 100644
> --- a/fs/xfs/xfs_sync.c
> +++ b/fs/xfs/xfs_sync.c
> @@ -400,7 +400,7 @@ xfs_sync_worker(
>  	 * cancel_delayed_work_sync on this work queue before tearing down
>  	 * the ail and the log in xfs_log_unmount.
>  	 */
> -	if (!(mp->m_super->s_flags & MS_ACTIVE) &&
> +	if ((mp->m_super->s_flags & MS_ACTIVE) &&
>  	    !(mp->m_flags & XFS_MOUNT_RDONLY)) {
>  		/* dgc: errors ignored here */
>  		if (mp->m_super->s_writers.frozen == SB_UNFROZEN &&

Do you need any logs or other things?


-- 
Matthias
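PS: For anyone skimming the patch, my reading of what the one-character
change does (just how I understand the discussion above, not an
authoritative description of the code): the old test only let the
periodic sync worker do its work when the superblock was *not* marked
MS_ACTIVE, i.e. during mount/unmount, so on a normal read-write mount
the 30-second background log pushes never ran. Annotated, the patched
check reads roughly like this:

/* xfs_sync_worker(), with the patch applied: the periodic background
 * work (the log flushing discussed above) only runs on a fully
 * mounted, writable filesystem. The body of the branch is elided. */
if ((mp->m_super->s_flags & MS_ACTIVE) &&      /* mount has completed      */
    !(mp->m_flags & XFS_MOUNT_RDONLY)) {       /* and the fs is read-write */
	/*
	 * With the old "!(... & MS_ACTIVE)" test this branch was skipped
	 * for the whole life of a normal read-write mount, so the
	 * 30-second background log pushes never happened and minutes of
	 * buffered data could be lost when the device disappeared.
	 */
}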