Re: [PATCH] xfs: fix iclog release error check race with shutdown

On Tue, Feb 18, 2020 at 07:53:13AM -0800, Christoph Hellwig wrote:
> On Mon, Feb 17, 2020 at 10:29:15AM -0500, Brian Foster wrote:
> > On Mon, Feb 17, 2020 at 05:33:14AM -0800, Christoph Hellwig wrote:
> > > On Fri, Feb 14, 2020 at 01:15:28PM -0500, Brian Foster wrote:
> > > > Prior to commit df732b29c8 ("xfs: call xlog_state_release_iclog with
> > > > l_icloglock held"), xlog_state_release_iclog() always performed a
> > > > locked check of the iclog error state before proceeding into the
> > > > sync state processing code. As of this commit, part of
> > > > xlog_state_release_iclog() was open-coded into
> > > > xfs_log_release_iclog() and as a result the locked error state check
> > > > was lost.
> > > > 
> > > > The lockless check still exists, but this doesn't account for the
> > > > possibility of a race with a shutdown being performed by another
> > > > task causing the iclog state to change while the original task waits
> > > > on ->l_icloglock. This has reproduced very rarely via generic/475
> > > > and manifests as an assert failure in __xlog_state_release_iclog()
> > > > due to an unexpected iclog state.
> > > > 
> > > > Restore the locked error state check in xlog_state_release_iclog()
> > > > to ensure that an iclog state update via shutdown doesn't race with
> > > > the iclog release state processing code.
> > > > 
> > > > Reported-by: Zorro Lang <zlang@xxxxxxxxxx>
> > > > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> > > > ---
> > > >  fs/xfs/xfs_log.c | 4 ++++
> > > >  1 file changed, 4 insertions(+)
> > > > 
> > > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > > > index f6006d94a581..f38fc492a14d 100644
> > > > --- a/fs/xfs/xfs_log.c
> > > > +++ b/fs/xfs/xfs_log.c
> > > > @@ -611,6 +611,10 @@ xfs_log_release_iclog(
> > > >  	}
> > > >  
> > > >  	if (atomic_dec_and_lock(&iclog->ic_refcnt, &log->l_icloglock)) {
> > > > +		if (iclog->ic_state == XLOG_STATE_IOERROR) {
> > > > +			spin_unlock(&log->l_icloglock);
> > > > +			return -EIO;
> > > > +		}
> > > 
> > > So the check just above also shuts the file system down.  Any reason to
> > > do that in one case and not the other?
> > > 
> > 
> > The initial check (with the shutdown) was originally associated with the
> > return from xlog_state_release_iclog(). That covers both state checks,
> > as they were both originally within that function. My impression was
> > there isn't a need to shutdown in the second check because the only way
> > the iclog state changes to IOERROR across that lock cycle is due to a
> > shutdown already in progress.
> 
> The original code did the force shutdown for both cases.  So unless we
> have a good reason to do it differently I'd just add a goto label and
> merge the two cases to restore the old behavior.
> 

Ok. I'm not sure I see the point, but it's harmless, and I can make
Eric's fix as well, so I'll post a v2.
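
For reference, a rough userspace sketch of the pattern under discussion: a
lockless error check, a locked re-check after the refcount drop, and both
funnelled through one goto label that forces the shutdown. Every model_*
name below is a hypothetical stand-in (an atomic_flag spinlock in place of
->l_icloglock, a simplified take on atomic_dec_and_lock()). It only
illustrates the control flow and is not the v2 patch or the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the iclog states; not the kernel's values. */
enum { MODEL_ACTIVE, MODEL_WANT_SYNC, MODEL_SYNCING, MODEL_IOERROR };

struct model_log {
	atomic_flag	lock;		/* models ->l_icloglock */
	bool		shutdown;	/* set once a forced shutdown has run */
};

struct model_iclog {
	atomic_int		refcnt;
	atomic_int		state;
	struct model_log	*log;
};

static void model_lock(struct model_log *log)
{
	while (atomic_flag_test_and_set(&log->lock))
		;	/* spin */
}

static void model_unlock(struct model_log *log)
{
	atomic_flag_clear(&log->lock);
}

static void model_init(struct model_log *log, struct model_iclog *iclog,
		       int refs, int state)
{
	atomic_flag_clear(&log->lock);
	log->shutdown = false;
	atomic_init(&iclog->refcnt, refs);
	atomic_init(&iclog->state, state);
	iclog->log = log;
}

/* Idempotent: marks the log shut down and the iclog as errored. */
static void model_force_shutdown(struct model_iclog *iclog)
{
	model_lock(iclog->log);
	iclog->log->shutdown = true;
	atomic_store(&iclog->state, MODEL_IOERROR);
	model_unlock(iclog->log);
}

/* Simplified model: returns true with the lock held iff the count hit zero. */
static bool model_dec_and_lock(struct model_iclog *iclog)
{
	int old = atomic_load(&iclog->refcnt);

	/* Fast path: not the last reference, so no lock is needed. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(&iclog->refcnt, &old, old - 1))
			return false;
	}

	/* Slow path: take the lock before dropping a possibly last reference. */
	model_lock(iclog->log);
	if (atomic_fetch_sub(&iclog->refcnt, 1) == 1)
		return true;
	model_unlock(iclog->log);
	return false;
}

static int model_release_iclog(struct model_iclog *iclog)
{
	/* Lockless check: cheap, but may be stale once we hold the lock. */
	if (atomic_load(&iclog->state) == MODEL_IOERROR)
		goto error;

	if (model_dec_and_lock(iclog)) {
		/* Locked re-check: a shutdown may have run while we waited. */
		if (atomic_load(&iclog->state) == MODEL_IOERROR) {
			model_unlock(iclog->log);
			goto error;
		}

		/* Sync state processing must never see an errored iclog. */
		assert(atomic_load(&iclog->state) != MODEL_IOERROR);
		if (atomic_load(&iclog->state) == MODEL_WANT_SYNC)
			atomic_store(&iclog->state, MODEL_SYNCING);
		model_unlock(iclog->log);
	}
	return 0;

error:
	/* Both checks land here, so both force the shutdown. */
	model_force_shutdown(iclog);
	return -EIO;
}
```

In this model the second model_force_shutdown() call is redundant whenever
the state flipped because of a shutdown already in progress, which is why
merging the two cases should be harmless.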

Brian



