Re: xfs: fix CIL push hang in for-next tree

On Tue, Jun 15, 2021 at 04:46:56PM +1000, Dave Chinner wrote:
> Hi folks,
> 
> This is the first fix for the problems Brian has reported from
> generic/019. This has fixed the hang, but the other log recovery
> problem he reported is still present (seen once with these patches
> in place).
> 
> I've tested these out to a couple of hundred cycles of
> continual looping generic/019 before the systems fall over with a
> perag reference count underrun at unmount after a shutdown. I'm
> pretty sure the hang is fixed, as it would manifest within 10-20
> cycles without this patch.
> 
> The first patch is the iclogbuf state tracing I used to capture the
> iclogbuf wrapping state. The second patch is the fix.

I found another bug while testing for-next.  If I run generic/100 more
than about 30 times with a 1k block size:

FSTYP         -- xfs (debug)
PLATFORM      -- Linux/x86_64 flax-mtr00 5.13.0-rc4-djwx #rc4 SMP PREEMPT Mon Jun 7 11:17:23 PDT 2021
MKFS_OPTIONS  -- -f -b size=1024, /dev/sdf
MOUNT_OPTIONS -- -o usrquota,grpquota,prjquota, /dev/sdf /opt

I see this in dmesg:

run fstests generic/100 at 2021-06-15 10:41:45
XFS (sda): ctx ticket reservation ran out. Need to up reservation
XFS (sda): ticket reservation summary:
XFS (sda):   unit res    = 47168 bytes
XFS (sda):   current res = -404 bytes
XFS (sda):   original count  = 1
XFS (sda):   remaining count = 1
XFS (sda): xfs_do_force_shutdown(0x2) called from line 2440 of file fs/xfs/xfs_log.c. Return address = xlog_write+0x608/0x640 [xfs]
XFS (sda): Log I/O Error Detected. Shutting down filesystem
XFS (sda): Please unmount the filesystem and rectify the problem(s)
XFS (sda): Unmounting Filesystem

Looking up that line in gdb produces:

0xffffffffa038a0a8 is in xlog_write (fs/xfs/xfs_log.c:2439).
2434            int                     log_offset;
2435
2436            if (ticket->t_curr_res < 0) {
2437                    xfs_alert_tag(log->l_mp, XFS_PTAG_LOGRES,
2438                         "ctx ticket reservation ran out. Need to up reservation");
2439                    xlog_print_tic_res(log->l_mp, ticket);
2440                    xfs_force_shutdown(log->l_mp, SHUTDOWN_LOG_IO_ERROR);
2441            }
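
To make the numbers in the reservation summary concrete, here is a toy
model of the check above.  It is only a sketch, not kernel code: the
struct and both helpers are hypothetical, and it assumes the reservation
is a plain byte counter that each write is charged against.  Under that
assumption, a unit reservation of 47168 bytes ending at -404 bytes means
47572 bytes were charged in total.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the log ticket.  Only the two values shown
 * in the dmesg summary are modelled; the real ticket has more fields.
 */
struct toy_ticket {
	int unit_res;	/* bytes reserved per unit ("unit res") */
	int curr_res;	/* bytes still available ("current res") */
};

/* Charge a write of 'bytes' against the ticket (assumed accounting). */
static void toy_ticket_charge(struct toy_ticket *tic, int bytes)
{
	assert(bytes >= 0);
	tic->curr_res -= bytes;
}

/* Same shape as the quoted test: a negative balance means an overrun. */
static bool toy_ticket_overrun(const struct toy_ticket *tic)
{
	return tic->curr_res < 0;
}
```

Starting from curr_res = unit_res = 47168 and charging 47572 bytes leaves
curr_res at -404, at which point toy_ticket_overrun() returns true.  That
is the state the summary above reports just before the shutdown.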

I haven't applied these two patches yet, but looking back through
fstests reports, I never saw this before the recent for-next push.
I'm not sure whether the CIL work or the xattr refactoring caused
this, though AFAICT generic/100 itself does not generate any xattrs and
I don't have any LSMs enabled that would cause them to be created.

--D

> 
> Cheers,
> 
> Dave.
> 
> 


