Re: [PATCH] xfs: shutdown on failure to add page to log bio

On Tue, Mar 24, 2020 at 01:29:49PM -0400, Brian Foster wrote:
> On Tue, Mar 24, 2020 at 10:18:59AM -0700, Darrick J. Wong wrote:
> > On Tue, Mar 24, 2020 at 12:57:00PM -0400, Brian Foster wrote:
> > > If the bio_add_page() call fails, we proceed to write out a
> > > partially constructed log buffer. This corrupts the physical log
> > > such that log recovery is not possible. Worse, persistent
> > > occurrences of this error eventually lead to a BUG_ON() failure in
> > > bio_split() as iclogs wrap the end of the physical log, which
> > > triggers log recovery on subsequent mount.
> > > 
> > > Rather than warn about writing out a corrupted log buffer, shut down
> > > the fs as is done for any log I/O related error. This preserves the
> > > consistency of the physical log such that log recovery succeeds on a
> > > subsequent mount. Note that this was observed on a 64k page debug
> > > kernel without upstream commit 59bb47985c1d ("mm, sl[aou]b:
> > > guarantee natural alignment for kmalloc(power-of-two)"), which
> > > demonstrated frequent iclog bio overflows due to unaligned (slab
> > > allocated) iclog data buffers.
> > 
> > Fixes: tag?
> > 
> 
> I suppose you could argue it fixes commit 79b54d9bfcdcd ("xfs: use bios
> directly to write log buffers"), but I didn't include a tag because this
> is not really fixing a reproducible bug. It's fixing up the error
> handling based on a bad combination of patches in a distro kernel.
> Perhaps I'm just not clear on when we do or don't want a fixes tag..?

[Summarizing what I rambled about on IRC:]

From my perspective, this looks like you concluded that the WARN_ON_ONCE
wasn't sufficient to deal with the error (because the physical log got
corrupted), so you're adding branch code to shut down the log.

Granted, it should only happen if bio_add_page fails, but as that's not
part of xfs, we have to code defensively enough to avoid breaking the
filesystem.
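
To make the failure mode concrete, here is a rough userspace model of the
mapping loop. It is only a sketch, not the kernel code: sketch_bio,
sketch_bio_add() and sketch_map_data() are hypothetical names, the 4k page
size is an assumption, and the model assumes the bio has room for exactly
ceil(count / page size) segments and never merges adjacent pages. Under
those assumptions, a page-aligned buffer fits, while an unaligned buffer of
the same size straddles one extra page and the final add fails, which is
the point where the patched code shuts down instead of submitting a short
bio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this model */

/* Hypothetical stand-in for a bio: it only counts segments against a cap. */
struct sketch_bio {
	unsigned int	max_vecs;
	unsigned int	nr_vecs;
};

/*
 * Models the contract the patch relies on: return len when the segment
 * was added, something other than len (here 0) when the bio is full.
 */
size_t
sketch_bio_add(struct sketch_bio *bio, size_t len)
{
	if (bio->nr_vecs >= bio->max_vecs)
		return 0;
	bio->nr_vecs++;
	return len;
}

/*
 * Walk the buffer one page-bounded chunk at a time. Returns true if the
 * whole buffer was mapped; false means mapping stopped early and the
 * caller must treat the buffer as unwritable (the patch shuts down here).
 */
bool
sketch_map_data(struct sketch_bio *bio, uintptr_t addr, size_t count)
{
	assert(bio != NULL);

	while (count) {
		size_t	off = addr % SKETCH_PAGE_SIZE;
		size_t	len = SKETCH_PAGE_SIZE - off;

		if (len > count)
			len = count;

		/* Stop at the first failed add; never submit a partial map. */
		if (sketch_bio_add(bio, len) != len)
			return false;

		addr += len;
		count -= len;
	}
	return true;
}
```

With an 8-segment bio and a 32k buffer, a start address of 0x10000 maps
cleanly, whereas 0x10000 + 512 needs a ninth segment and
sketch_map_data() returns false.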

Looks ok, will add fixes tag and send it to the testcloud...
Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>

--D

> Brian
> 
> > Otherwise, looks ok to me.
> > 
> > --D
> > 
> > > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> > > ---
> > >  fs/xfs/xfs_log.c | 14 ++++++++++----
> > >  1 file changed, 10 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > > index 2a90a483c2d6..ebb6a5c95332 100644
> > > --- a/fs/xfs/xfs_log.c
> > > +++ b/fs/xfs/xfs_log.c
> > > @@ -1705,16 +1705,22 @@ xlog_bio_end_io(
> > >  
> > >  static void
> > >  xlog_map_iclog_data(
> > > -	struct bio		*bio,
> > > -	void			*data,
> > > +	struct xlog_in_core	*iclog,
> > >  	size_t			count)
> > >  {
> > > +	struct xfs_mount	*mp = iclog->ic_log->l_mp;
> > > +	struct bio		*bio = &iclog->ic_bio;
> > > +	void			*data = iclog->ic_data;
> > > +
> > >  	do {
> > >  		struct page	*page = kmem_to_page(data);
> > >  		unsigned int	off = offset_in_page(data);
> > >  		size_t		len = min_t(size_t, count, PAGE_SIZE - off);
> > >  
> > > -		WARN_ON_ONCE(bio_add_page(bio, page, len, off) != len);
> > > +		if (bio_add_page(bio, page, len, off) != len) {
> > > +			xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR);
> > > +			break;
> > > +		}
> > >  
> > >  		data += len;
> > >  		count -= len;
> > > @@ -1762,7 +1768,7 @@ xlog_write_iclog(
> > >  	if (need_flush)
> > >  		iclog->ic_bio.bi_opf |= REQ_PREFLUSH;
> > >  
> > > -	xlog_map_iclog_data(&iclog->ic_bio, iclog->ic_data, count);
> > > +	xlog_map_iclog_data(iclog, count);
> > >  	if (is_vmalloc_addr(iclog->ic_data))
> > >  		flush_kernel_vmap_range(iclog->ic_data, count);
> > >  
> > > -- 
> > > 2.21.1
> > > 
> > 
> 


