Re: [PATCH 1/2] xfs: zero length symlinks are not valid

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jun 19, 2018 at 07:54:34AM -0400, Brian Foster wrote:
> On Tue, Jun 19, 2018 at 08:42:59AM +1000, Dave Chinner wrote:
> > On Mon, Jun 18, 2018 at 09:24:11AM -0400, Brian Foster wrote:
> > > On Mon, Jun 18, 2018 at 03:57:10PM +1000, Dave Chinner wrote:
> > > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > > > 
> > > > A log recovery failure has been reproduced where a symlink inode has
> > > > a zero length in extent form. It was caused by a shutdown during a
> > > > combined fstress+fsmark workload.
> > > > 
> > > > The underlying problem is the issue in xfs_inactive_symlink(): the
> > > > inode is unlocked between the symlink inactivation/truncation and
> > > > the inode being freed. This opens a window for the inode to be
> > > > written to disk before it xfs_ifree() removes it from the unlinked
> > > > list, marks it free in the inobt and zeros the mode.
> > > > 
> > > > For shortform inodes, the fix is simple. xfs_ifree() clears the data
> > > > fork state, so there's no need to do it in xfs_inactive_symlink().
> > > > This means the shortform fork verifier will not see a zero length
> > > > data fork as it mirrors the inode size through to xfs_ifree()), and
> > > > hence if the inode gets written back and the fork verifiers are run
> > > > they will still see a fork that matches the on-disk inode size.
> > > > 
> > > > For extent form (remote) symlinks, it is a little more tricky. Here
> > > > we explicitly set the inode size to zero, so the above race can lead
> > > > to zero length symlinks on disk. Because the inode is unlinked at
> > > > this point (i.e. on the unlinked list) and unreferenced, it can
> > > > never be seen again by a user. Hence when we set the inode size to
> > > 
> > > Perhaps we don't care, but what about bulkstat? The inode free is
> > > imminent, so it probably doesn't matter that much.
> > 
> > Bulkstat users have to deal with all sorts of TOCTOU races because
> > we do the inode lookup after unlocking the AGI. Hence the inodes
> > returned are not guaranteed to be allocated or even accessible,
> > especially if they return nlink == 0...
> > 
> > So, yeah, I don't think it matters for bulkstat.
> > 
> 
> Ack.
> 
> > > > @@ -523,17 +531,10 @@ xfs_inactive_symlink(
> > > >  		return -EIO;
> > > >  
> > > >  	xfs_ilock(ip, XFS_ILOCK_EXCL);
> > > > -
> > > > -	/*
> > > > -	 * Zero length symlinks _can_ exist.
> > > > -	 */
> > > >  	pathlen = (int)ip->i_d.di_size;
> > > > -	if (!pathlen) {
> > > > -		xfs_iunlock(ip, XFS_ILOCK_EXCL);
> > > > -		return 0;
> > > > -	}
> > > > +	ASSERT(pathlen);
> > > 
> > > Ok, but I still feel like I'm missing something here. The commit log
> > > explains how symlink inode inactivation creates this transient zero-size
> > > symlink state. That's fine, but why do we have code to check for such
> > > symlinks _in_ the symlink activation path?
> > 
> > Ah, I put that in the cover letter - the VFS is supposed to ensure
> > we don't get zero length symlinks passed to us, and the verifiers
> > are supposed to catch it coming off disk. This is mostly just a
> > bounds check to document that we don't expect zero length symlinks
> > at any time, especially in the unlink path. I put the same assert
> > check in the xfs_symlink() create path, too, to ensure we don't get
> > passed a zero length from the VFS...
> > 
> 
> Oh, I'm not referring to the ASSERT() here. That all seems fine...
> 
> > > Is the assumption that this is just an overzealous/unnecessary check, or
> > > am I missing something? If the former, an extra sentence in the last
> > > paragraph or so of the commit log to explain why we remove this would be
> > > nice.
> > 
> > It's a "should never occur" situation but, well, paranoia seeing as
> > we've got this wrong for so long...
> > 
> 
> Rather, I'm asking about the pre-existing code that we remove. The hunk
> just above from xfs_inactive_symlink():
> 
> -	/*
> -	 * Zero length symlinks _can_ exist.
> -	 */
> -	if (!pathlen) {
> -		xfs_iunlock(ip, XFS_ILOCK_EXCL);
> -		return 0;
> -	}
> 
> ... obviously suggests that there could have been a situation where we'd
> see a zero-length symlink on entry to xfs_inactive_symlink(). This
> patch, however, focuses on fixing the transient zero-length symlink
> caused by xfs_inactive_symlink_rmt() (which comes after the above
> check).
> 
> The latter implies we'd never enter the symlink inactivate path with a
> zero-length symlink, since this is the path that causes it (until the
> inode is ultimately freed). So my question essentially is do we have any
> idea why this code exists in the first place? Is it just bogus given
> there's no other clear vector to a zero-length symlink before the inode
> is inactivated?

I think that vector is log recovery.  Say we have a filesystem running
on an old kernel, and the fs goes down immediately after the transaction
commit in xfs_inactive_symlink_rmt.  At that exact moment we have a
zero-size extents format symlink on the unlinked list, and if we boot up
with a new kernel and go through log recovery we'll try to free this
inode in xlog_recover_process_one_iunlink.

The new code that switches i_mode prevents us from dropping more of
these onto the disk, but we still have old filesystems to deal with.
FWIW every now and then generic/475 complains in xfs_repair -n about
such a symlink (dfork format 2 and size 0).

--D

> Brian
> 
> > > > -	if (pathlen < 0 || pathlen > XFS_SYMLINK_MAXLEN) {
> > > > +	if (pathlen <= 0 || pathlen > XFS_SYMLINK_MAXLEN) {
> > > >  		xfs_alert(mp, "%s: inode (0x%llx) bad symlink length (%d)",
> > > >  			 __func__, (unsigned long long)ip->i_ino, pathlen);
> > > >  		xfs_iunlock(ip, XFS_ILOCK_EXCL);
> > > > @@ -541,12 +542,12 @@ xfs_inactive_symlink(
> > > >  		return -EFSCORRUPTED;
> > > >  	}
> > > >  
> > > > +	/*
> > > > +	 * Inline fork state gets removed by xfs_difree() so we have nothing to
> > > > +	 * do here in that case.
> > > > +	 */
> > > 
> > > It looks like that actually happens in xfs_ifree() (not xfs_difree()).
> > 
> > Yup, will fix.
> > 
> > Cheers,
> > 
> > Dave.
> > -- 
> > Dave Chinner
> > david@xxxxxxxxxxxxx
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux