On Fri, Jun 22, 2018 at 06:44:48AM -0400, Brian Foster wrote:
> On Thu, Jun 21, 2018 at 03:53:46PM -0700, Darrick J. Wong wrote:
> > On Fri, Jun 22, 2018 at 08:31:41AM +1000, Dave Chinner wrote:
> > > On Thu, Jun 21, 2018 at 07:46:18AM -0400, Brian Foster wrote:
> > > > On Thu, Jun 21, 2018 at 08:59:18AM +1000, Dave Chinner wrote:
> > > > > On Wed, Jun 20, 2018 at 07:50:48AM -0400, Brian Foster wrote:
> > > > > > If I recreate that same dirty log state and mount with this patch
> > > > > > applied (note that the fs is created without this patch to simulate
> > > > > > an old kernel that has not changed i_mode in the same transaction
> > > > > > that sets di_size = 0), along with a hack to avoid the check in
> > > > > > xfs_dinode_verify(), I now hit the new assert and corruption error
> > > > > > that have been placed in xfs_inactive_symlink().
> > > > > >
> > > > > > So to Darrick's point, that seems to show that this is a vector to
> > > > > > the pre-existing len == 0 check in xfs_inactive_symlink(). Given
> > > > > > that, it seems to me that if we want to handle recovery from this
> > > > > > state, we'd still need to work around the verifier check and retain
> > > > > > the initial di_size == 0 check in xfs_inactive_symlink().
> > > > >
> > > > > I think we should get rid of the transient state, not continue to
> > > > > work around it. Because the transient state only exists in a log
> > > > > recovery context, we can change the behaviour and not care about
> > > > > older kernels still having the problem, because all that is needed
> > > > > to avoid the issue is a clean log when upgrading the kernel.
> > > > >
> > > >
> > > > Right... that sounds reasonable to me, but we still need to consider
> > > > how we handle a filesystem that is already in this state. BTW, was
> > > > this a user report or a manufactured corruption from fuzzing/shutdown
> > > > testing or something?
> > >
> > > It was shutdown testing.
> > > The report was that xfs_repair -n failed
> > > with a zero-length symlink, not that log recovery failed. I assumed
> > > that log recovery had worked fine before xfs_repair -n was run because
> > > it didn't warn about a dirty log. However, now that you point out
> > > that we just toss unlinked list recovery failures (which I'd forgotten
> > > about!), I'm guessing that happened and then repair tripped over
> > > the leaked symlink inode.
> > >
> > > > Given that additional context, I don't feel too strongly about
> > > > needing to specially handle the "zero-length symlink already exists
> > > > in the dirty log due to xfs_inactive_symlink_rmt()" case. Chances
> > > > are the mount that reported the error already nuked the state in the
> > > > first place, so users shouldn't really end up "stuck" between
> > > > needing a kernel fix and an 'xfs_repair -L' run (but if that's the
> > > > approach we take, please just note the tradeoff in the commit log).
> > > > Just my .02.
> > >
> > > Yup, that seems reasonable to me - leaking the inode until repair is
> > > run has no user impact. Repair will do the same thing for any inode
> > > it finds on the unlinked list that is corrupted, so my thinking is
> > > that if we want to improve corruption handling we need to address the
> > > general unlinked list corruption issue rather than just this specific
> > > inode corruption case.
> >
> > FWIW, I saw generic/475 trip over this last night, and judging from
> > the logs I saw that behavior -- first we shut down right after
> > committing the zero-length symlink, then recovery makes it as far as
> > iunlink processing, where it blows up, tosses out that failure, and
> > then the next time we hit that inode we trip over it all over again.
> >
>
> What do you mean by "the next time we hit that inode"? IIUC, iunlink
> processing nukes the entire AGI unlinked list bucket in response to the
> read error. Given the inode was unlinked, shouldn't it no longer be
> accessible after that point?
> Are you hitting some other path
> that attempts to read the inode?

Aha, xfs_scrub finds the busted, inaccessible inode when it scans the
filesystem, though frustratingly the fs shuts down because xfs_inactive
gets called on the corrupted symlink via xfs_fs_destroy_inode ->
xfs_inactive -> xfs_inactive_symlink... will have to review scrub's
iget use.

--D

> Brian
>
> > --D
> >
> > > Alright - I'll clean up the patch, make notes of all this in the
> > > commit message and repost.
> > >
> > > Thanks, Brian!
> > >
> > > Cheers,
> > >
> > > Dave.
> > >
> > > --
> > > Dave Chinner
> > > david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html