On Mon, 30 Jun 2008 23:00:18 BST, Duane Griffin said:

> ext3_dx_find_entry uses ext3_next_entry without verifying that the entry is
> valid. If its rec_len == 0 this causes an infinite loop. Refactor the loop
> to check the validity of entries before checking whether they match and
> moving onto the next one.

This may or may not be related, but I've managed to hit another interesting
piece of ext3 damage while running 26-rc8-mmotd-0701:

% /bin/ls -l /usr/share/man/man5 | grep lvm
/bin/ls: cannot access /usr/share/man/man5/lvm.conf.5.gz: Stale NFS file handle
-????????? ? ? ? ? ? lvm.conf.5.gz

Yes, that *is* on an ext3 filesystem. debugfs on /usr/share is interesting:

debugfs: stat /man/man5/lvm.conf.5.gz
Inode: 59918 Type: regular Mode: 0644 Flags: 0x0 Generation: 4228691378 Version: 0x00000000
User: 0 Group: 0 Size: 0
File ACL: 239201 Directory ACL: 0
Links: 0 Blockcount: 0
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x486c6c0b -- Thu Jul 3 02:04:59 2008
atime: 0x47efcad7 -- Sun Mar 30 13:16:07 2008
mtime: 0x486c6c0b -- Thu Jul 3 02:04:59 2008
dtime: 0x486c6c0b -- Thu Jul 3 02:04:59 2008
BLOCKS:

Zero links, even though man/man5 references it, and the ctime/mtime/dtime
are suspicious as well: that file belongs to an RPM that was last updated
back on June 20, there are no obvious culprit processes in lastcomm that
were running at 2:04AM, and none of the current ones look obvious either.
(The system was booted at 00:21, so the failure happened about 1 hour 40 mins
after the current kernel launched.)

Nothing in dmesg from around 2:04AM, and nothing around when the /bin/ls
is run.

An 'ls -lR /usr/share' shows that the *other* 127,619 files on the filesystem
are all OK; it's just this one.

Any brilliant ideas on how to track this down further?
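
For anyone who wants to repeat the 'ls -lR' style check without eyeballing the output, here is a minimal sketch in Python. It walks a tree, calls lstat() on every directory entry, and collects the names that are listed in a directory but cannot be stat'ed (as with the "Stale NFS file handle" error above). The function name find_unstatable is made up for illustration, and this only detects the symptom; it does not say anything about the cause.

```python
import os
import stat


def find_unstatable(root):
    """Walk root; return (checked, failures).

    checked  -- number of directory entries that were lstat()ed
    failures -- list of (path, OSError) for entries or directories
                that could not be read
    """
    checked = 0
    failures = []
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            names = os.listdir(directory)
        except OSError as err:
            # The directory itself could not be read.
            failures.append((directory, err))
            continue
        for name in names:
            path = os.path.join(directory, name)
            checked += 1
            try:
                info = os.lstat(path)  # lstat: do not follow symlinks
            except OSError as err:
                # The name is present in the directory, but its inode
                # cannot be read.
                failures.append((path, err))
                continue
            if stat.S_ISDIR(info.st_mode):
                stack.append(path)
    return checked, failures
```

Calling find_unstatable("/usr/share") and printing err.errno and err.strerror for each failure should give the same information as scanning the ls output for lines full of question marks.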