What to do? Error: (Attempt to read block from filesystem resulted in short read) while doing inode scan

tytso at mit.edu (Theodore Ts'o) · Fri, 30 Aug 2002 08:34:40 -0400

On Fri, Aug 30, 2002 at 11:58:49AM +0100, Ed W wrote:
> Pass 1: Checking inodes, blocks, and sizes
> Error reading block xxxxxxx (Attempt to read block from filesystem resulted
> in short read) while doing inode scan.
> 
> I don't recall the exact block numbers (seem to have misplaced my notes
> where I wrote them down...).  However, they were reproducible every time I
> ran fsck
> 
> So the question is how do I take this forward?  I assume that it means a
> dead sector?  Is there any way to find out which files occupy that sector so
> that I can be suspicious of their quality?  

This specific error means that you had a bad block inside your inode
table.  Depending on where in the inode table the bad block falls, you
could have lost as many as 32 files.  (If you saw complaints about
directory entries pointing to deleted files, that would be
confirmation.)
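
The figure of 32 comes from how many inodes fit in one filesystem
block.  As a rough sketch, assuming a 4096-byte block and 128-byte
inodes (both values are assumptions here; "dumpe2fs -h" reports the
real ones for a given filesystem as "Block size" and "Inode size"),
the arithmetic looks like this.  The function name is made up for
illustration:

```shell
# Sketch: how many inodes share a single filesystem block.
# inodes_per_block and its default values are illustrative assumptions.
inodes_per_block() {
    block_size=${1:-4096}   # bytes per filesystem block (assumed default)
    inode_size=${2:-128}    # bytes per on-disk inode (assumed default)
    echo $(( block_size / inode_size ))
}

# One unreadable block in the inode table can take this many inodes with it.
inodes_per_block 4096 128
```

With a 1024-byte block the same calculation gives 8, so the worst case
depends on how the filesystem was created.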

> Can I mark the sector as bad and recover any of the data?  

You can mark the sector as bad by doing "e2fsck -c".  You can try to
see if the disk drive will remap the sector by doing a non-destructive
read/write badblocks check.  With e2fsck versions 1.26 or later, this
can be done by using the command "e2fsck -cc".  With older versions of
e2fsprogs, you'll need to run "badblocks -n" manually.  Note that this
can be dangerous!  If your disk is starting to die, more disk activity
can make things worse, not better.  I'd normally suggest that people
do a full disk-to-disk image backup before starting.
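
As a sketch of the version rule above, the helper below only prints
the command that would apply and never touches a disk.  The function
name and the /dev/hdXN placeholder are hypothetical, and it relies on
"sort -V" (a GNU coreutils extension) to compare version strings:

```shell
# Sketch: print (do not run) the non-destructive read/write check that
# matches a given e2fsprogs version.  suggest_check and the device
# placeholder are hypothetical names used only for illustration.
suggest_check() {
    version=$1   # e2fsprogs version string, e.g. "1.27"
    device=$2    # partition to check, e.g. /dev/hdXN (placeholder)

    # sort -V orders version strings; if "1.26" sorts first (or ties),
    # the given version is 1.26 or later.
    oldest=$(printf '%s\n%s\n' "1.26" "$version" | sort -V | head -n 1)

    if [ "$oldest" = "1.26" ]; then
        echo "e2fsck -cc $device"
    else
        echo "badblocks -n $device"
    fi
}

suggest_check 1.27 /dev/hdXN
suggest_check 1.19 /dev/hdXN
```

Either command should only be run against an unmounted filesystem, and
only after the image backup is safely in hand.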

> The only hint I really have that something is wrong is my backup via tar
> always ends with:
> 
>           tar: Error exit delayed from previous errors
> 
> (As an aside is tar a satisfactory method of doing a full backup of an ext3
> filesystem?)

Under normal circumstances, yes.  A much more paranoid way to do a disk
backup, which is what I recommend in these sorts of circumstances, is
to do a disk-to-disk image backup.  In order to do this you need a
disk partition at least as big as the one which you are backing up.
You then issue this command, while the filesystem is unmounted:

	dd if=/dev/hd_old_disk of=/dev/hd_backup_disk bs=1k conv=sync,noerror
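
A minimal sketch of the same idea with a size check in front of it
follows.  The wrapper name is made up, and measuring sizes with
"wc -c" is an assumption that only makes sense for image files; for
real block devices the size would have to come from somewhere else
(on Linux, "blockdev --getsize64" is one option):

```shell
# Sketch: refuse to copy if the destination is too small, then image
# the source with dd.  image_copy is a hypothetical helper; the size
# check via wc -c is only suitable for regular image files.
image_copy() {
    src=$1
    dst=$2

    # $(( ... )) strips any padding some wc implementations emit.
    src_bytes=$(( $(wc -c < "$src") ))
    dst_bytes=$(( $(wc -c < "$dst") ))

    if [ "$dst_bytes" -lt "$src_bytes" ]; then
        echo "destination is smaller than source" >&2
        return 1
    fi

    # noerror: keep going after a read error instead of stopping.
    # sync:    pad short or failed reads with zeros up to the block size,
    #          so everything after a bad spot stays at the same offset.
    dd if="$src" of="$dst" bs=1k conv=sync,noerror
}
```

The small block size is deliberate: with bs=1k, an unreadable spot
costs at most 1k of zero padding in the copy, at the price of a slower
transfer.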

If you don't have a spare disk, consider getting one.  Disk drives are
terribly cheap these days, and the data on the disk is generally worth
a factor of 10 or 100 more than purchasing a new disk might cost.
This is why sometimes people will simply replace a disk at the drop of
a hat as soon as they start seeing soft errors (i.e., warnings from
the disk that there were read errors that were correctable using ECC),
never mind the hard errors which you're clearly seeing here.  It just
isn't worth their time to dick around figuring out whether the errors
on the disk are stable, or to replace the disk and deal with recovering
from backups.  It's often better to spend the $100 or so for a new
80 gig drive, and just move on.

> Pointers and preferably notes as to what to be careful not to do would be
> really appreciated (this is a live system and I would rather not need to
> test my backups).

Good luck!!

						- Ted