On Fri, Aug 30, 2002 at 11:58:49AM +0100, Ed W wrote:
> Pass 1: Checking inodes, block, and sizes
> Error reading block xxxxxxx (Attempt to read block from filesystem resulted
> in short read) while doing inode scan.
>
> I don't recall the exact block numbers (seem to have misplaced my notes
> where I wrote them down...). However, they were reproducible every time I
> ran fsck
>
> So the question is how do I take this forward? I assume that it means a
> dead sector? Is there any way to find out which files occupy that sector so
> that I can be suspicious of their quality?

This specific error means that you had a bad block inside your inode
table. Depending on where in the inode table you had the bad block, you
could have lost as many as 32 files. (If you saw complaints about
directory entries pointing to deleted files, that would be
confirmation.)

> Can I mark the sector as bad and recover any of the data?

You can mark the sector as bad by doing "e2fsck -c". You can try to see
if the disk drive will remap the sector by doing a non-destructive
read/write badblocks check. With e2fsck versions 1.26 or later, this
can be done by using the command "e2fsck -cc". With older versions of
e2fsprogs, you'll need to run "badblocks -n" manually.

Note that this can be dangerous! If your disk is starting to die, more
disk activity can make things worse, not better. I'd normally suggest
that people do a full disk-to-disk image backup before starting.

> The only hint I really have that something is wrong is my backup via tar
> always ends with:
>
> tar: Error exit delayed from previous errors
>
> (As an aside is tar a satisfactory method of doing a full backup of an ext3
> filesystem?)

Under normal circumstances yes. A much more paranoid way to do a disk
backup, which is what I recommend in these sorts of circumstances, is
to do a disk-to-disk image backup. In order to do this you need a disk
partition at least as big as the one which you are backing up.
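The size requirement can be checked before copying anything. What
follows is only a minimal sketch: it assumes "blockdev" from util-linux
is installed, the device names are placeholders to be replaced with
your real ones, and the helper names device_bytes and fits_on_backup
are made up for illustration.

```shell
#!/bin/sh
# Sketch: confirm the backup partition can hold the source partition
# before starting an image copy. Helper names are hypothetical.

# Print the size of a block device in bytes.
# Assumes util-linux blockdev is available; normally needs root.
device_bytes() {
    blockdev --getsize64 "$1"
}

# Exit status 0 when the destination is at least as large as the source.
# Arguments: source size in bytes, destination size in bytes.
fits_on_backup() {
    src_bytes=$1
    dst_bytes=$2
    [ "$dst_bytes" -ge "$src_bytes" ]
}

# Hypothetical usage, as root, with real device names substituted:
#   src=$(device_bytes /dev/hd_old_disk)
#   dst=$(device_bytes /dev/hd_backup_disk)
#   fits_on_backup "$src" "$dst" || echo "backup partition too small" >&2
```

The comparison is kept separate from the device query so that the
check itself can be tried out on plain numbers without touching a disk.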
You then issue this command, while the filesystem is unmounted:

	dd if=/dev/hd_old_disk of=/dev/hd_backup_disk bs=1k conv=sync,noerror

If you don't have a spare disk, consider getting one. Disk drives are
terribly cheap these days, and the data on the disk is generally worth
10 or 100 times what a new disk might cost.

This is why sometimes people will simply replace a disk at the drop of
a hat as soon as they start seeing soft errors (i.e., warnings from the
disk that there were read errors that were correctable using ECC),
never mind the hard errors which you're clearly seeing here. Their time
to dick around and figure out whether the errors on the disk are
stable, or to replace the disk and deal with recovering from backups,
just simply isn't worth it. It's often better to spend the $100 or so
for a new 80 gig drive, and just move on.

> Pointers and preferably notes as to what to be careful not to do would be
> really appreciated (this is a live system and I would rather not need to
> test my backups).

Good luck!!

						- Ted