Re: Question about 67dc288c ("xfs: ensure verifiers are attached to recovered buffers")

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Oct 16, 2017 at 03:18:18PM -0700, Darrick J. Wong wrote:
> On Tue, Oct 17, 2017 at 08:29:33AM +1100, Dave Chinner wrote:
> > On Sat, Oct 14, 2017 at 12:05:30PM -0700, Darrick J. Wong wrote:
> > > On Sat, Oct 14, 2017 at 07:55:51AM -0400, Brian Foster wrote:
> > > > On Fri, Oct 13, 2017 at 11:49:16AM -0700, Darrick J. Wong wrote:
> > > > > Hi all,
> > > > > 
> > > > > I have a question about 67dc288c ("xfs: ensure verifiers are attached to
> > > > > recovered buffers").  I was analyzing a scrub failure on generic/392
> > > > > with a v4 filesystem which stems from xfs_scrub_buffer_recheck (it's in
> > > > > scrub part 4) being unable to find a b_ops attached to the AGF buffer
> > > > > and signalling error.
> > > > > 
> > > > > The pattern I observe is that when log recovery runs on a v4 filesystem,
> > > > > we call some variant of xfs_buf_read with a NULL ops parameter.  The
> > > > > buffer therefore gets created and read without any verifiers.
> > > > > Eventually, xlog_recover_validate_buf_type gets called, and on a v5
> > > > > filesystem we come back and attach verifiers and all is well.  However,
> > > > > on a v4 filesystem the function returns without doing anything, so the
> > > > > xfs_buf just sits around in memory with no verifier.  Subsequent
> > > > > read/log/relse patterns can write anything they want without write
> > > > > verifiers to check that.
> > > > > 
> > > > > If the v4 fs didn't need log recovery, the buffers get created with
> > > > > b_ops as you'd expect.
> > > > > 
> > > > > My question is, shouldn't xlog_recover_validate_buf_type unconditionally
> > > > > set b_ops and save the "if (hascrc)" bits for the part that ensures the
> > > > > LSN is up to date?
> > > > > 
> > > > 
> > > > Seems reasonable, but I notice that the has_crc() check around
> > > > _validate_buf_type() comes in sometime after the the original commit
> > > > referenced below (d75afeb3) and commit 67dc288c. It appears to be due to
> > > > commit 9222a9cf86 ("xfs: don't shutdown log recovery on validation
> > > > errors").
> > > > 
> > > > IIRC, the problem there is that log recovery had traditionally always
> > > > unconditionally replayed everything in the log over whatever resides in
> > > > the fs. This actually meant that recovery could transiently corrupt
> > > > buffers in certain cases if the target buffer happened to be relogged
> > > > more than once and was already up to date, which leads to verification
> > > > failures. This was addressed for v5 filesystems with LSN ordering rules,
> > > > but the challenge for v4 filesystems was that there is no metadata LSN
> > > > and thus no means to detect whether a buffer is already up to date with
> > > > regard to a transaction in the log.
> > > > 
> > > > Dave might have more historical context to confirm that... If that is
> > > > still an open issue, a couple initial ideas come to mind:
> > > > 
> > > > 1.) Do something simple/crude like reclaim all buffers after log
> > > > recovery on v4 filesystems to provide a clean slate going forward.
> > > > 
> > > > 2.) Unconditionally attach verifiers during recovery as originally done
> > > > and wire up something generic that short circuits verifier invocations
> > > > on v4 filesystems when log recovery is in progress.
> > > 
> > > What do you think about 3) add b_ops later if they're missing?
> > > 
> > > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > > index 2f97c12..8842a27 100644
> > > --- a/fs/xfs/xfs_buf.c
> > > +++ b/fs/xfs/xfs_buf.c
> > > @@ -742,6 +742,15 @@ xfs_buf_read_map(
> > >         if (bp) {
> > >                 trace_xfs_buf_read(bp, flags, _RET_IP_);
> > >  
> > > +               /*
> > > +                * If this buffer is up to date and has no verifier, try
> > > +                * to set one.  This can happen on v4 because log
> > > +                * recovery reads in the buffers for replay but does not
> > > +                * set b_ops.
> > > +                */
> > > +               if ((bp->b_flags & XBF_DONE) && !bp->b_ops)
> > > +                       bp->b_ops = ops;
> > 
> > I don't really like this because it will hide bugs in the code.
> > 
> > It also doesn't solve the problem because a read with NULL ops will
> > still leave a buffer with no ops attached.
> > 
> > IMO, if we've read/created a buffer without ops, then it is up to
> > the code that created/read it to either attach the required ops
> > before the buffer is released or to invalidate the buffer before
> > anyone else can use it or write it. The buffer write code warns
> > about writing buffers without verfiers, but we can't warn on read
> > because read-with-NULL-verifier is a valid thing to do....
> 
> Fair 'nuff.  FWIW I'm ok with approach #1 if anyone enthusiastically
> wants to write it up.
> 

What do you guys think about something like this? It runs
unconditionally, but this is partly as designed to provide a clean cache
on v5 filesystems as well. It should be mostly unnoticeable if log
recovery hasn't run, but we could easily add another flag to make it
conditional on log recovery if desired.

Brian

--- 8< ---

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index dc95a49..226d26b 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -780,6 +780,16 @@ xfs_log_mount_finish(
 	mp->m_super->s_flags &= ~MS_ACTIVE;
 	evict_inodes(mp->m_super);
 
+	/*
+	 * Drain the buffer LRU after log recovery. This is required for
+	 * v4 filesystems to avoid leaving around buffers with NULL verifier
+	 * ops. Do it unconditionally to make sure we're always in a clean cache
+	 * state after mount.
+	 */
+	xfs_log_force(mp, XFS_LOG_SYNC);
+	xfs_ail_push_all_sync(mp->m_ail);
+	xfs_wait_buftarg(mp->m_ddev_targp);
+
 	if (readonly)
 		mp->m_flags |= XFS_MOUNT_RDONLY;
 
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux