On 08/26/2013 12:13 AM, Dave Chinner wrote:
> On Thu, Aug 22, 2013 at 02:28:00PM -0400, Brian Foster wrote:
>> Hi all,
>>
>> I hit an assert on a debug kernel while beating on some finobt work and
>> eventually reproduced it on unmodified/TOT xfs/xfsprogs as of today. I
>> hit it through a couple different paths, first while running fsstress on
>> a CRC enabled filesystem (with otherwise default mkfs options):
>>
>> (These tests are running on a 4p, 4GB VM against a 100GB virtio disk,
>> hosted on a single spindle desktop box).
>>
>> crc=1
>> fsstress -z -fsymlink=1 -n99999999 -p4 -d /mnt/test
>>
>> XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length),
>
> Directory buffer overrun.
>
>> [<ffffffffa031d549>] xfs_trans_log_buf+0x89/0x1b0 [xfs]
>> [<ffffffffa02e7c1c>] xfs_da3_node_add+0x11c/0x210 [xfs]
>> [<ffffffffa02ea703>] xfs_da3_node_split+0xc3/0x230 [xfs]
>> [<ffffffffa02eaa18>] xfs_da3_split+0x1a8/0x410 [xfs]
>> [<ffffffffa02f743f>] xfs_dir2_node_addname+0x47f/0xde0 [xfs]
>
> During a split.
>
> Easily reproduced with "seq 200000 | xargs touch" as Michael Semon
> reported last week.
>
> The fix demonstrates my concerns about modifying directory code -
> the CRC changes missed a *fundamental* directory format definition,
> and we've only just tripped over it....

Don't fret too much over it. This test was part of coreutils, which is
something that I rebuild after a glibc upgrade. Had glibc-2.18 been
released six weeks ago, then I would have stumbled onto this XFS issue
six weeks ago.

>> rm -rf /mnt/test
>>
>> XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length),
>
> Directory buffer overrun.
>
>> [<ffffffffa032b549>] xfs_trans_log_buf+0x89/0x1b0 [xfs]
>> [<ffffffffa02f61ff>] xfs_da3_node_unbalance+0xef/0x1d0 [xfs]
>> [<ffffffffa02f98b0>] xfs_da3_join+0x240/0x290 [xfs]
>> [<ffffffffa030659b>] xfs_dir2_node_removename+0x69b/0x8b0 [xfs]
>
> During a merge. Not sure why that is happening on a v4 filesystem.
> V5 filesystem, yes, due to the above bug but v4 should not be
> affected.
>
> Cheers,
>
> Dave.

Your patch looks good, and I even applied it to vanilla 3.10.9, along
with Jeff Liu's MAX_LFS_FILESIZE patch. [Murphy's Law states that if I
didn't use Jeff's patch, then I would run xfstests generic/308 by
accident, leading to a hung umount. Happens every single time.]

Both patches applied cleanly to kernels on a 2.8 GHz i686 Pentium 4 PC
that was running Slackware 14.0 Linux.

Naturally, `seq 200000 | xargs touch` was run for v5 and v4 XFS file
systems. All was okay. The removal of the populated directory went
fine as well.

The v5 file systems were tested using a 3.11-rc7+ git kernel. xfstests
was run from the start of generic/ through generic/127, and that went
fine. Some of the xfs/* series was run, but the results were merely
scanned because the v5-output-cleanup patches were not readily
available.

The v4 file systems were tested with a patched vanilla 3.10.9 kernel,
and some of generic/ was run. Patched and unpatched kernels showed the
same good results, with very little difference in timing overall.

Thanks!

Michael

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
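
For anyone repeating the test, the populate-then-remove sequence described in this message can be sketched as a small shell function. This is only a rough sketch, not the exact procedure used in the thread: the function name `populate_and_remove` and its two parameters are hypothetical, and creating and mounting the v4 or v5 (crc=1) XFS file system is assumed to have been done beforehand.

```shell
#!/bin/sh
# Sketch of the test from the thread: fill one directory with many empty
# files (as `seq 200000 | xargs touch` does), then remove the directory.
# populate_and_remove, target and count are hypothetical names chosen for
# this illustration; they do not come from the thread.
populate_and_remove() {
    target=$1   # directory to create on the already-mounted XFS file system
    count=$2    # number of files to create (the thread used 200000)

    mkdir -p "$target" || return 1

    # Use a subshell so the caller's working directory is left alone.
    ( cd "$target" && seq "$count" | xargs touch ) || return 1

    # Count what was actually created before tearing the directory down.
    created=$(find "$target" -type f | wc -l | tr -d ' ')

    # Removal is the step that hit the second assertion in the report.
    rm -rf "$target" || return 1

    echo "$created"
}

# Example call (the mount point is taken from the quoted report):
#   populate_and_remove /mnt/test/many 200000
```

The function prints the number of files it created, so a run against a patched kernel should print the requested count and leave no directory behind. On a debug kernel without the fix, the assertion would be expected to show up in the kernel log rather than in the script's output.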