Re: [PATCH v2] xfs: load uncached unlinked inodes into memory on demand

On Thu, Aug 31, 2023 at 03:39:28PM -0500, Eric Sandeen wrote:
> On 8/31/23 7:39 AM, Ritesh Harjani (IBM) wrote:
> > "Darrick J. Wong" <djwong@xxxxxxxxxx> writes:
> > 
> >> From: Darrick J. Wong <djwong@xxxxxxxxxx>
> >>
> >> shrikanth hegde reports that filesystems fail shortly after mount with
> >> the following failure:
> >>
> >> 	WARNING: CPU: 56 PID: 12450 at fs/xfs/xfs_inode.c:1839 xfs_iunlink_lookup+0x58/0x80 [xfs]
> >>
> >> This of course is the WARN_ON_ONCE in xfs_iunlink_lookup:
> >>
> >> 	ip = radix_tree_lookup(&pag->pag_ici_root, agino);
> >> 	if (WARN_ON_ONCE(!ip || !ip->i_ino)) { ... }
> >>
> >> From diagnostic data collected by the bug reporters, it would appear
> >> that we cleanly mounted a filesystem that contained unlinked inodes.
> >> Unlinked inodes are only processed as a final step of log recovery,
> >> which means that clean mounts do not process the unlinked list at all.
> >>
> >> Prior to the introduction of the incore unlinked lists, this wasn't a
> >> problem because the unlink code would (very expensively) traverse the
> >> entire ondisk metadata iunlink chain to keep things up to date.
> >> However, the incore unlinked list code complains when it realizes that
> >> it is out of sync with the ondisk metadata and shuts down the fs, which
> >> is bad.
> >>
> >> Ritesh proposed to solve this problem by unconditionally parsing the
> >> unlinked lists at mount time, but this imposes a mount time cost for
> >> every filesystem to catch something that should be very infrequent.
> >> Instead, let's target the places where we can encounter a next_unlinked
> >> pointer that refers to an inode that is not in cache, and load it into
> >> cache.
> >>
> >> Note: This patch does not address the problem of iget loading an inode
> >> from the middle of the iunlink list and needing to set i_prev_unlinked
> >> correctly.
> >>
> >> Reported-by: shrikanth hegde <sshegde@xxxxxxxxxxxxxxxxxx>
> >> Triaged-by: Ritesh Harjani <ritesh.list@xxxxxxxxx>
> >> Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> >> ---
> >> v2: log that we're doing runtime recovery, don't mess with DONTCACHE,
> >>     and actually return ENOLINK
> >> ---
> >>  fs/xfs/xfs_inode.c |   75 +++++++++++++++++++++++++++++++++++++++++++++++++---
> >>  fs/xfs/xfs_trace.h |   25 +++++++++++++++++
> >>  2 files changed, 96 insertions(+), 4 deletions(-)
> > 
> > Hi Darrick,
> > 
> > Thanks for taking a look at this. I tested this patch on the setup where
> > Shrikanth earlier saw the crash. I can still see a problem: the code is
> > taking this branch:
> > 
> > +	/* If this is not an unlinked inode, something is very wrong. */
> > +	if (VFS_I(next_ip)->i_nlink != 0) {
> > +		error = -EFSCORRUPTED;
> > +		goto rele;
> > +	}
> > 
> > Here are the logs for reference:
> > 
> > [   21.399573] XFS (dm-0): Found unrecovered unlinked inode 0x2ec44d in AG 0x0.  Initiating recovery.
> > [   21.400150] XFS (dm-0): Internal error xfs_trans_cancel at line 1104 of file fs/xfs/xfs_trans.c.  Caller xfs_remove+0x1a0/0x310 [xfs]
> 
> Do you have a metadump for that filesystem, to examine that inode?

IIRC, Ritesh's problem was that there were inodes on the unlinked list
*and* they had nonzero i_nlink.  This patch doesn't address that; you'll
have to wait for the online repair version.
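
To make the failure mode concrete, here is a rough userspace model of the
on-demand reload and the i_nlink check quoted above. It is only a sketch
under stated assumptions, not the kernel code: every toy_* name, the
fixed-size arrays standing in for the inode cache and the ondisk inodes,
and the TOY_EFSCORRUPTED value are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; none of this is the XFS API. */
#define TOY_NINODES      8
#define TOY_NULLAGINO    UINT32_MAX  /* toy end-of-chain marker */
#define TOY_EFSCORRUPTED 117         /* stand-in error code */

struct toy_dinode {                  /* toy "ondisk" inode */
	uint32_t nlink;
	uint32_t next_unlinked;
};

struct toy_inode {                   /* toy "incore" inode */
	uint32_t nlink;
	uint32_t next_unlinked;
	uint32_t prev_unlinked;
	int      in_cache;
};

struct toy_ag {
	struct toy_dinode disk[TOY_NINODES];   /* indexed by agino */
	struct toy_inode  cache[TOY_NINODES];  /* indexed by agino */
};

/* Return the cached inode, or NULL if it is not in the toy cache. */
struct toy_inode *toy_cache_lookup(struct toy_ag *ag, uint32_t agino)
{
	if (agino >= TOY_NINODES || !ag->cache[agino].in_cache)
		return NULL;
	return &ag->cache[agino];
}

/*
 * Called when an unlinked-list walk finds a next_unlinked pointer whose
 * target is not in cache. Loads the target from "disk" and records its
 * back pointer. Returns 0 on success or a negative toy error code.
 */
int toy_iunlink_reload_next(struct toy_ag *ag, uint32_t prev_agino,
			    uint32_t next_agino)
{
	const struct toy_dinode *dip;
	struct toy_inode *ip;

	if (next_agino >= TOY_NINODES)
		return -TOY_EFSCORRUPTED;
	if (toy_cache_lookup(ag, next_agino))
		return 0;                /* already cached, nothing to do */

	dip = &ag->disk[next_agino];

	/* An inode on the unlinked list must have a zero link count. */
	if (dip->nlink != 0)
		return -TOY_EFSCORRUPTED;

	ip = &ag->cache[next_agino];
	ip->nlink = 0;
	ip->next_unlinked = dip->next_unlinked;
	ip->prev_unlinked = prev_agino;
	ip->in_cache = 1;
	return 0;
}
```

In this model, an inode that sits on the unlinked chain but still has a
nonzero link count is rejected before it ever reaches the cache, which
mirrors the branch Ritesh is hitting.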

--D

> -Eric
> 
> > [   21.400222] CPU: 0 PID: 1629 Comm: systemd-tmpfile Not tainted 6.5.0+ #2
> > [   21.400226] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1010.22 (NH1010_122) hv:phyp pSeries
> > [   21.400230] Call Trace:
> > [   21.400231] [c000000014cdbb70] [c000000000f377b8] dump_stack_lvl+0x6c/0x9c (unreliable)
> > [   21.400239] [c000000014cdbba0] [c008000000f7c204] xfs_error_report+0x5c/0x80 [xfs]
> > [   21.400303] [c000000014cdbc00] [c008000000fab320] xfs_trans_cancel+0x178/0x1b0 [xfs]
> > [   21.400371] [c000000014cdbc50] [c008000000f999d8] xfs_remove+0x1a0/0x310 [xfs]
> > [   21.400432] [c000000014cdbcc0] [c008000000f93eb0] xfs_vn_unlink+0x68/0xf0 [xfs]
> > [   21.400493] [c000000014cdbd20] [c0000000005b8038] vfs_rmdir+0x178/0x300
> > [   21.400498] [c000000014cdbd60] [c0000000005be444] do_rmdir+0x124/0x240
> > [   21.400502] [c000000014cdbdf0] [c0000000005be594] sys_rmdir+0x34/0x50
> > [   21.400506] [c000000014cdbe10] [c000000000033c38] system_call_exception+0x148/0x3a0
> > [   21.400511] [c000000014cdbe50] [c00000000000c6d4] system_call_common+0xf4/0x258
> 
> 
> 


