On Thu, Jul 8, 2010 at 9:21 PM, Shaun Adolphson <shaun@xxxxxxxxxxxxx> wrote:
> On Wed, Jul 7, 2010 at 9:18 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>>
>> On Tue, Jul 06, 2010 at 08:57:45PM +1000, Shaun Adolphson wrote:
>> > Hi,
>> >
>> > We have been able to repeatably produce xfs internal errors
>> > (XFS_WANT_CORRUPTED_GOTO) on one of our fileservers. We are attempting
>> > to locally copy a 248Gig file off a USB drive formatted as NTFS to the
>> > xfs drive. The copy gets about 96% of the way through and we get the
>> > following messages:
>> >
>> > Jun 28 22:14:46 terrorserver kernel: XFS internal error
>> > XFS_WANT_CORRUPTED_GOTO at line 2092 of file fs/xfs/xfs_bmap_btree.c.
>> > Caller 0xffffffff8837446f
>>
>> Interesting. That's a corrupted inode extent btree - I haven't seen
>> one of them for a long while. Were there any errors (like IO errors)
>> reported before this?
>>
>> However, the first step is to determine if the error is on disk or an
>> in-memory error. Can you post output of:
>>
>> - xfs_info <mntpt>

meta-data=/dev/TerrorVolume/terror isize=256    agcount=130385, agsize=32768 blks
         =                         sectsz=512   attr=1
data     =                         bsize=4096   blocks=4272433152, imaxpct=25
         =                         sunit=0      swidth=0 blks
naming   =version 2                bsize=4096   ascii-ci=0
log      =internal                 bsize=4096   blocks=2560, version=1
         =                         sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                     extsz=4096   blocks=0, rtextents=0

>> - xfs_repair -n after a shutdown

The output of xfs_repair -n is 6MB; below is the condensed version. I can
post the whole output if required.

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
.
.
.
        - agno = 130384
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

>>
>> Can you upgrade xfsprogs (i.e. xfs_repair) to the latest version
>> (3.1.2) before you do this as well?

# xfs_repair -V
xfs_repair version 3.1.2

>
> We have upgraded the xfsprogs to 3.1.2 and are in the process of
> collecting the required information.
>
>>
>> > We have reproduced the condition 3 times and each time we have been
>> > able to remount the drive (to replay the transaction log) and then
>> > perform an xfs_repair.
>> >
>> > We are just using cp to copy the file.
>> >
>> > Some further details about the system:
>> >
>> > Software:
>> > - Fresh install of CentOS 5.5 64bit, all patches up to date
>> > - Kernel 2.6.18-194.3.1.el5.centos.plus
>>
>> I've got no idea exactly what version of XFS that has in it, so I
>> can't say off the top of my head whether this is a fixed bug or not.
>>
>> Cheers,
>>
>> Dave.
>> --
>> Dave Chinner
>> david@xxxxxxxxxxxxx
>
>

During other testing we have also been able to reproduce the issue by
copying a self-generated 248Gig file from another system disk to the
XFS disk. The file was generated using dd with an input of /dev/zero.

All the existing data (~6TB) was successfully copied onto the storage
without hitting the error. The thing to note is that all the existing
files are much smaller than the one that we are trying to copy in
(248Gig). And since we have been having the shutdowns we have copied
many smaller files (files < 30Gig in size) onto the storage area
without issue.

Thanks,
Shaun

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
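[Editor's note: the dd invocation used to generate the test file is not shown
in the thread. A minimal sketch of how such a file could be produced from
/dev/zero follows; the output path and the 16 MiB size here are illustrative
placeholders, not the reporter's actual command (the original file was
248Gig, which at bs=1M would be on the order of count=253952).]

```shell
# Sketch only: generate a test file from /dev/zero with dd.
# Path and size are placeholders; the original report used a 248Gig file
# written onto the XFS filesystem.
dd if=/dev/zero of=./bigfile.img bs=1M count=16
```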