Re: [PATCH 8/9] xfs: fuzz every field of every structure

On Wed, Nov 09, 2016 at 01:25:23AM -0800, Darrick J. Wong wrote:
> On Wed, Nov 09, 2016 at 05:13:44PM +0800, Eryu Guan wrote:
> > On Wed, Nov 09, 2016 at 12:52:36AM -0800, Darrick J. Wong wrote:
> > > On Wed, Nov 09, 2016 at 04:09:24PM +0800, Eryu Guan wrote:
> > > > On Fri, Nov 04, 2016 at 05:18:00PM -0700, Darrick J. Wong wrote:
> > > > > Previously, our XFS fuzzing efforts were limited to using the xfs_db
> > > > > blocktrash command to scribble garbage all over a block.  This is
> > > > > pretty easy to discover; it would be far more interesting if we could
> > > > > fuzz individual fields looking for unhandled corner cases.  Since we
> > > > > now have an online scrub tool, use it to check for our targeted
> > > > > corruptions prior to the usual steps of writing to the FS, taking it
> > > > > offline, repairing, and re-checking.
> > > > > 
> > > > > These tests use the new xfs_db 'fuzz' command to test corner case
> > > > > handling of every field.  The 'print' command tells us which fields
> > > > > are available, and the fuzz command can write zeroes or ones to the
> > > > > field; set the high, middle, or low bit; add or subtract numbers; or
> > > > > randomize the field.  We loop through all fields and all fuzz verbs to
> > > > > see if we can trip up the kernel.
> > > > > 
> > > > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > > > 
> > > > The first test gave me a kernel crash :) xfs/1300 crashed a kernel
> > > > built from your djwong-devel branch. I appended the console log at
> > > > the end of this mail in case you are interested.
> > > > 
> > > > And another xfs/1300 run gave me this failure message:
> > > > 
> > > >     +/mnt/testarea/scratch: Kernel lacks GETFSMAP; scrub will be less efficient. (xfs.c line 661)
> > > >     +/mnt/testarea/scratch: Kernel cannot help scrub metadata; scrub will be incomplete. (xfs.c line 661)
> > > >     +/mnt/testarea/scratch: Kernel cannot help scrub inodes; scrub will be incomplete. (xfs.c line 661)
> > > >     +/mnt/testarea/scratch: Kernel cannot help scrub extent map; scrub will be less efficient. (xfs.c line 661)
> > > > 
> > > > Is this a known issue, or something that should be filtered out in the test?
> > > 
> > > That's strange, the djwong-devel branch should have getfsmap & scrub in it...
> > > 
> > > ...are you running the djwong-devel kernel and xfsprogs code?  The scrub
> > > ioctl structure has shifted some over the past few months, though GETFSMAP
> > > hasn't changed in ages.
> > > 
> > > Wait, "another xfs/1300 run" ... so after the first crash, did you go
> > > back to a vanilla kernel without all my crazypatches? :)
> > 
> > Ahh, you're right! It had booted into the vanilla 4.9-rc4 kernel, sorry
> > about that.. But xfs/1300 crashed djwong-devel a second time on my
> > second try, so the crash seems to be reliably reproducible with reflink
> > enabled.
> 
> I think if you change the XFS_SCRUB_OP_ERROR_GOTO at line 2237 of
> xfs_scrub_get_inode() to "if (error) goto out_err;" that ought to clear it up.
> 
> > > > And ext4/1300 generated a large .out.bad file (51M), containing
> > > > something like:
> > > > 
> > > > +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101381632/2469888/4096) ends past end of filesystem at 31457280. (generic.c line 272)
> > > > +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101389824/2478080/4096) starts past end of filesystem at 31457280. (generic.c line 264)
> > > > +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101389824/2478080/4096) ends past end of filesystem at 31457280. (generic.c line 272)
> > > > +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101398016/2486272/4096) starts past end of filesystem at 31457280. (generic.c line 264)
> > > > +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101398016/2486272/4096) ends past end of filesystem at 31457280. (generic.c line 272)
> > > > +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101406208/2494464/4096) starts past end of filesystem at 31457280. (generic.c line 264)
> > > > +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101406208/2494464/4096) ends past end of filesystem at 31457280. (generic.c line 272)
> > > > +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101414400/2502656/4096) starts past end of filesystem at 31457280. (generic.c line 264)
> > > > +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101414400/2502656/4096) ends past end of filesystem at 31457280. (generic.c line 272)
> > > > 
> > > > Seems like scrub found something wrong (real problems) and became very
> > > > noisy?
> > > 
> > > Hmm that's even stranger.  I'll try to reproduce tomorrow.
> > 
> > So this ext4 noise came from the vanilla kernel too. Retested with the
> > djwong-devel kernel & userspace, ext4/1300 passed without problems.
> > Sorry for my noise..
> 
> But that's even more weird; there haven't been any changes to ext4 that
> would explain why this breaks on a vanilla 4.9-rc4 kernel...

Puzzle solved: I had somehow switched back to mainline xfsprogs (or some
other wrong xfsprogs version) after booting into the vanilla 4.9-rc4
kernel. After updating xfsprogs to djwong-devel, ext4/1300 showed no
problems on the 4.9-rc4 kernel either.

Sorry again for the mess!

Eryu
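
For anyone reading along: the patch description quoted at the top of this
thread loops over every field and every fuzz verb. Below is a minimal
sketch of that enumeration only, not the actual test code. The verb names
paraphrase the patch description and may not match the real xfs_db
spellings, the "name = value" print format is an assumption, and
parse_fields() and fuzz_plan() are hypothetical helpers. Nothing here
invokes xfs_db or touches a filesystem.

```python
from itertools import product

# Verb names paraphrase the patch description ("zeroes or ones; high,
# middle, or low bit; add or subtract; randomize"). The spellings that
# xfs_db actually accepts may differ; these are assumptions.
FUZZ_VERBS = [
    "zeroes", "ones",
    "high-bit", "middle-bit", "low-bit",
    "add", "subtract",
    "random",
]


def parse_fields(print_output):
    """Collect field names from 'name = value' lines (assumed format)."""
    fields = []
    for line in print_output.splitlines():
        name, sep, _value = line.partition("=")
        name = name.strip()
        if sep and name:
            fields.append(name)
    return fields


def fuzz_plan(fields, verbs=FUZZ_VERBS):
    """Return every (field, verb) pair, grouped by field."""
    return list(product(fields, verbs))


# Made-up sample standing in for the output of a 'print' command.
sample = "magicnum = 0x58465342\nblocksize = 4096\ndblocks = 7680\n"
plan = fuzz_plan(parse_fields(sample))
for field, verb in plan:
    print(f"fuzz {field} with {verb}")
```

Each pair would then go through the cycle the patch description lists:
scrub online, write to the FS, take it offline, repair, and re-check.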