Re: [Bug Report]: generic/085 triggers an XFS panic on kernel 4.14-rc2

On Mon, Oct 02, 2017 at 09:58:49AM +1100, Dave Chinner wrote:
> On Sat, Sep 30, 2017 at 11:28:57AM +0800, Zorro Lang wrote:
> > Hi,
> > 
> > I hit a panic[1] when I ran xfstests on a debug kernel v4.14-rc2
> > (with xfsprogs 4.13.1), and I reproduced it twice on the same
> > machine. But I can't reproduce it on another machine.
> > 
> > Maybe there's some hardware-specific requirement to trigger this panic.
> > I tested on a normal disk partition, but the disk is a multi-stripe RAID
> > device. I didn't capture the mkfs output from g/085, but I found the
> > default mkfs output (mkfs.xfs -f /dev/sda3) is:
> > 
> > meta-data=/dev/sda3              isize=512    agcount=16, agsize=982528 blks
> >          =                       sectsz=512   attr=2, projid32bit=1
> >          =                       crc=1        finobt=1, sparse=0, rmapbt=0, reflink=0
> > data     =                       bsize=1024   blocks=15720448, imaxpct=25
> >          =                       sunit=512    swidth=1024 blks
> > naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
> > log      =internal log           bsize=1024   blocks=10240, version=2
> >          =                       sectsz=512   sunit=32 blks, lazy-count=1
> > realtime =none                   extsz=4096   blocks=0, rtextents=0
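
For what it's worth, that geometry should be easy to recreate on a
scratch device - a sketch, with su/sw derived from the sunit/swidth
numbers above (512 and 1024 1k blocks), reusing the reporter's device
path:

    # mkfs.xfs -f -b size=1024 -d su=512k,sw=2 /dev/sda3
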
> 
> FWIW, I've come across a few of these log recovery crashes recently
> when reworking mkfs.xfs. The cause has always been either a log that
> is too small or a mismatch between the log size and the log stripe
> unit configuration. The typical sign is either a negative buffer
> length like this one ("XFS (dm-0): Invalid block length (0xfffffed8)
> for buffer"; 0xfffffed8 is -296 as a signed 32-bit value) or the
> head/tail block initially being calculated before or after the
> actual log, so the log offset ends up negative.
> 
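
A quick way to eyeball that mismatch on an affected (unmounted) device
is to read the superblock fields directly - same device path as above:

    # xfs_db -c 'sb 0' -c 'p blocksize' -c 'p logblocks' -c 'p logsunit' /dev/sda3

logsunit is reported in bytes; a log whose size (logblocks * blocksize)
isn't a multiple of it, or a log that is simply too small, would be the
sort of mismatch Dave describes.
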
> I'm guessing the recent log validity checking we've added isn't as
> robust as it should be, but I haven't had time to dig into it yet.
> I've used xfs_logprint to debug the issues far enough to point at
> mkfs being in the wrong - it runs the same head/tail recovery code
> as the kernel, so it typically crashes on the same problems. It's
> much easier to debug in userspace with gdb, though.....
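
For reference, the userspace replay of that head/tail search is just
(illustrative device path again):

    # xfs_logprint -t /dev/sda3

If mkfs wrote broken log geometry, this typically falls over the same
way the kernel does at mount time, and it can be run under gdb.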

Just to pile on with everyone else: I've noticed that fuzzing logsunit
to -1 causes the mount process to spit out a bunch of recovery-related
I/O errors.  Shortly thereafter the kernel crashes too.
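
For anyone who wants to replay that by hand, it amounts to roughly the
following - a sketch, assuming an xfsprogs whose xfs_db has the fuzz
command (the verb "ones" sets every bit in the field, i.e. -1; exact
flags may vary by version), and the same illustrative device path:

    # xfs_db -x -c 'sb 0' -c 'fuzz -d logsunit ones' /dev/sda3
    # mount /dev/sda3 /mnt

The mount is what spews the recovery I/O errors before the crash.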

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx


