Re: XFS write cache flush policy

Dave Chinner <david@xxxxxxxxxxxxx> · Tue, 18 Dec 2012 10:24:41 +1100

On Mon, Dec 17, 2012 at 12:22:51AM +0100, Matthias Schniedermeyer wrote:
> On 17.12.2012 07:48, Dave Chinner wrote:
> > Can you provide the information I asked - what was there before the
> > unmount,
> 
> With time reduced to 2 minutes and everything put in a script (see below).
> 
> After about 2 minutes:
> ls -lR /mnt
> /mnt:
> total 0
> drwxr-xr-x 2 root root 105 Dec 17 00:13 a
> 
> /mnt/a:
> total 11186176
> -rw-r--r-- 1 root root 943718400 Dec 17 00:11 1
> -rw-r--r-- 1 root root 943718400 Dec 17 00:13 10
> -rw-r--r-- 1 root root 943718400 Dec 17 00:13 11
> -rw-r--r-- 1 root root 891450368 Dec 17 00:13 12
> -rw-r--r-- 1 root root 943718400 Dec 17 00:11 2
> -rw-r--r-- 1 root root 943718400 Dec 17 00:11 3
> -rw-r--r-- 1 root root 943718400 Dec 17 00:12 4
> -rw-r--r-- 1 root root 943718400 Dec 17 00:12 5
> -rw-r--r-- 1 root root 943718400 Dec 17 00:12 6
> -rw-r--r-- 1 root root 943718400 Dec 17 00:12 7
> -rw-r--r-- 1 root root 943718400 Dec 17 00:12 8
> -rw-r--r-- 1 root root 943718400 Dec 17 00:12 9
> 
> I yanked out the cable immediatly after the ls.
> 
> > what is there before recovery
> 
> xfs_logprint:
>     data device: 0x811
>     log device: 0x811 daddr: 104857632 length: 102400
> 
> cycle: 1        version: 2              lsn: 1,0        tail_lsn: 1,0
> length of Log Record: 20        prev offset: -1         num ops: 1
> uuid: b22d154a-b580-4d3a-bc09-ac0c62d39d39   format: little endian linux
> h_size: 32768
> ----------------------------------------------------------------------------
> Oper (0): tid: b0c0d0d0  len: 8  clientid: LOG  flags: UNMOUNT
> Unmount filesystem
> 
> ============================================================================
> xfs_logprint: skipped 4096 cleared blocks in range: 2 - 4097
> xfs_logprint: skipped 98302 zeroed blocks in range: 4098 - 102399
> xfs_logprint: physical end of log
> ============================================================================
> xfs_logprint: logical end of log
> ============================================================================

Not what I was after (i.e. the raw dump of the log to a file via
xfs_logprint -C), but in this case serves to indicate that nothing
ever got written to the log. That's not exactly what I was
expecting, given that you've been reporting that some files were
there after remount.

> mount /dev/sdb1 /mnt -oro,norecovery
> ls -lR /mnt
> /mnt:
> total 0
> 
> So nothing.

If nothing got written to the log, then no metadata got written
to disk because it is pinned in memory until the log is written.

> > And depending on what occurred, send me the log file so I
> > can see what was actually supposed to be replayed by log recovery?
> 
> Regarding to the 'event trace' from your other mail, i have no idea what
> or how to do that.

http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

> Dec 17 00:13:28 localhost kernel: [  325.562642] lost page write due to I/O error on sdb1
> Dec 17 00:13:28 localhost kernel: [  325.562643] Buffer I/O error on device sdb1, logical block 2613285
> Dec 17 00:13:28 localhost kernel: [  325.562643] lost page write due to I/O error on sdb1
> Dec 17 00:13:28 localhost kernel: [  325.562644] Buffer I/O error on device sdb1, logical block 2613286
> Dec 17 00:13:28 localhost kernel: [  325.562644] lost page write due to I/O error on sdb1
> Dec 17 00:13:28 localhost kernel: [  325.562645] Buffer I/O error on device sdb1, logical block 2613287
> Dec 17 00:13:28 localhost kernel: [  325.562645] lost page write due to I/O error on sdb1
> Dec 17 00:13:28 localhost kernel: [  325.562646] Buffer I/O error on device sdb1, logical block 2613288
> Dec 17 00:13:28 localhost kernel: [  325.562646] lost page write due to I/O error on sdb1
> Dec 17 00:13:28 localhost kernel: [  325.647364] XFS (sdb1): metadata I/O error: block 0x6400022 ("xlog_iodone") error 5 numblks 64
> Dec 17 00:13:28 localhost kernel: [  325.648559] XFS (sdb1): xfs_do_force_shutdown(0x2) called from line 1074 of file /xssd/usr_src/linux/fs/xfs/xfs_log.c.  Return address = 0xffffffff812a1e49
> Dec 17 00:13:28 localhost kernel: [  325.649804] XFS (sdb1): Log I/O Error Detected.  Shutting down filesystem

And there's why nothing got written to the log - you pulled the plug
before it was flushed. But that also implies it wasn't flushed for a
couple of minutes, which is not correct. They should be occurring
every 30s when the filesystem is only slightly busy like this.

Hmmmm, that rigns a bell. I think I came across this when doing a
RHEL6 backport recently, and decided that nothing needed doing
upstream because there were already patches pending to fix it.
That is, there is a logic bug is in xfssyncd_worker(), and that was
fixed by the xfssyncd-die-die-die patchset that is now in 3.8-rc1.

So, two things: firstly, can you check that the oldest file that is
lost on 3.8-rc1 is less than 30s old, and secondly, whether this
patch gives the same result as 3.8-rc1 on 3.6/3.7:

diff --git a/fs/xfs/xfs_sync.c b/fs/xfs/xfs_sync.c
index 9500caf..7bf85e8 100644
--- a/fs/xfs/xfs_sync.c
+++ b/fs/xfs/xfs_sync.c
@@ -400,7 +400,7 @@ xfs_sync_worker(
 	 * cancel_delayed_work_sync on this work queue before tearing down
 	 * the ail and the log in xfs_log_unmount.
 	 */
-	if (!(mp->m_super->s_flags & MS_ACTIVE) &&
+	if ((mp->m_super->s_flags & MS_ACTIVE) &&
 	    !(mp->m_flags & XFS_MOUNT_RDONLY)) {
 		/* dgc: errors ignored here */
 		if (mp->m_super->s_writers.frozen == SB_UNFROZEN &&


Cheers,

Dave.

-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs