On Mon, Apr 18, 2011 at 09:24:22PM +0200, Anisse Astier wrote:
> Hi,
>
> (first of all, I'm not subscribed to the list, Please cc-me on all replies)
>
> On an ARM NAS, using kernel 2.6.36.2 I managed to crash my root xfs partition.
>
> xfs_repair cannot then repair this partition and is crashing itself.
>
> # xfs_info /dev/sda2
> meta-data=/dev/sda2              isize=256    agcount=32, agsize=7615249 blks
>          =                       sectsz=512   attr=1
> data     =                       bsize=4096   blocks=243687968, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=32768, version=1
>          =                       sectsz=512   sunit=0 blks, lazy-count=0
> realtime =none                   extsz=65536  blocks=0, rtextents=0
>
>
>
> I did a SMART test to ensure the disk didn't have any bad block:
> SMART Error Log Version: 1
> No Errors Logged
>
> SMART Self-test log structure revision number 1
> Num Test_Description Status Remaining
> LifeTime(hours) LBA_of_first_error
> # 1 Extended offline Completed without error 00% 8327 -
>
> The dmesg log (on another recovery system with kernel 2.6.36-rc2) ; I
> tried to mount the system :
> [ 1003.257446] XFS mounting filesystem sda2
> [ 1003.301519] Starting XFS recovery on filesystem: sda2 (logdev: internal)
> [ 1003.303068] XFS: bad number of regions (28024) in inode log format
> [ 1003.303142] XFS: log mount/recovery failed: error 5
> [ 1003.303419] XFS: log mount failed

Something has corrupted the log....

> I then had no other choice than suppressing the log with xfs_repair -L.

Yup.

> xfs_repair crashed, but I was able to mount the filesystem(ro), but
> once I tried accessing the corrupt files, xfs would go mad:
> [13717.138896] UDF-fs: No partition found (1)
> [13717.202112] XFS mounting filesystem sda2
> [13717.274885] Ending clean XFS mount for filesystem: sda2
> [43969.970648] sshd (1039): /proc/1039/oom_adj is deprecated, please
> use /proc/1039/oom_score_adj instead.
> [107180.252602] Filesystem "sda2": corrupt dinode 805341224, (btree
> extents). Unmount and run xfs_repair.

Quite likely, zeroing the log effectively corrupts the filesystem.

.....
> directory flags set on non-directory inode 2283178100, would fix bad flags.
> bad key in bmbt root (is 73434, would reset to 74194) in inode
> 2283178100 data fork
> bad fwd (right) sibling pointer (saw 145202888 should be NULLDFSBNO)
> Segmentation fault

Hmmm. The very next line doesn't appear before the segfault, making
me think that it's the printf that is causing it to crash.

        if (check_dups == 0 &&
            cursor.level[0].right_fsbno != NULLDFSBNO)  {
                do_warn(
        _("bad fwd (right) sibling pointer (saw %llu should be NULLDFSBNO)\n"),
                        cursor.level[0].right_fsbno);

We get this line of output.

                do_warn(
        _("\tin inode %u (%s fork) bmap btree block %llu\n"),
                        XFS_AGINO_TO_INO(mp, agno, ino), forkname,
                        cursor.level[0].fsbno);

But not this one. I wonder if passing a 64bit number to a %u format
string (should be %llu) causes problems on ARM? All the variables are
valid as they are printed or accessed elsewhere in the function, so
that's the only thing I can think of without a stack trace to tell
me otherwise....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
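If the %u guess above is right, the fix would be to make the conversion
match the width of the argument. A minimal sketch, assuming
XFS_AGINO_TO_INO() yields a 64-bit inode number as described above;
format_inode_line() is a hypothetical stand-in for the second do_warn()
call, not actual xfs_repair code:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of xfs_repair): builds the same message as
 * the second do_warn() call, but with conversions that match 64-bit
 * arguments. Returns whatever snprintf() returns.
 */
static int format_inode_line(char *buf, size_t len, uint64_t ino,
                             const char *forkname, uint64_t fsbno)
{
    /*
     * Pairing "%u" with a 64-bit argument is undefined behaviour. On a
     * 32-bit target the following "%s" may then pick up part of the
     * integer instead of the string pointer, which could explain a
     * segfault inside printf. Casting to unsigned long long and using
     * "%llu" keeps format and arguments in step.
     */
    return snprintf(buf, len,
                    "\tin inode %llu (%s fork) bmap btree block %llu\n",
                    (unsigned long long)ino, forkname,
                    (unsigned long long)fsbno);
}
```

Building with gcc -Wall (which enables -Wformat) should flag the original
%u mismatch at compile time, provided do_warn() is declared with the
printf format attribute; whether it is declared that way is an assumption
here.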