2009/9/18 Andreas Dilger <adilger@xxxxxxx>:
>
> This isn't even safe on an UNMOUNTED filesystem, since "badblocks"
> by default does destructive testing of the block device.

By destructive testing, I think you mean a read/write test, since a read-only test isn't supposed to be destructive (usually).

According to badblocks(8), option -n, "By default only a non-destructive read-only test is done". Moreover, according to fsck.ext4(8), option -c, "This option causes e2fsck to use badblocks(8) program to do a read-only scan of the device in order to find any bad blocks" and later "If this option is specified twice, then the bad block scan will be done using a non-destructive read-write test".

So I think the *potentially* unsafe command you meant was "fsck.ext4 -n -c -c device".

Assuming that the manual is correct, and that "fsck.ext4 -n -c device" really does perform a read-only test, opening the fs only to update the bad blocks inode, my question still stands: is it safe to launch it weekly on a mounted filesystem? The wording of the manual seems to say "yes, it's supposed to be safe, but don't do it because of <unexplained reason>" :-)

> Since most
> disks will internally relocate bad blocks on writes, it is very
> unlikely that "badblocks" will ever find a problem on a new disk.
>

I'd like to believe you, but please read the "smartctl --all" output (attached) for a Toshiba 120GB notebook drive I recently replaced, or just look at this excerpt:

    5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       2
  196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       2
  ....
  Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
  # 1  Extended offline    Completed: read failure       00%      6366         57398211
  # 2  Extended offline    Completed: read failure       00%      6350         57398211

So, just 2 sectors were reallocated, but there are still read failures that are visible at the Linux block device layer.
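For anyone who wants to check such an excerpt mechanically, here is a minimal Python sketch. It assumes the column layout shown in the excerpt above. The names parse_excerpt and lba_to_fs_block, the 512-byte sector size, the 4096-byte filesystem block size and the partition start used in the usage note are illustrative assumptions, not values taken from the drive in question.

```python
import re

# Lines copied from the smartctl excerpt quoted above.
SMART_EXCERPT = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       2
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       2
# 1  Extended offline    Completed: read failure       00%      6366         57398211
# 2  Extended offline    Completed: read failure       00%      6350         57398211
"""

# Attribute row: ID, name, flag, value, worst, thresh, type, updated,
# when_failed, raw value (assumed to be a plain integer here).
ATTR_RE = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\d+"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)\s*$"
)

# Self-test row that ended in a read failure; the last column is the LBA.
TEST_RE = re.compile(
    r"^#\s*(\d+)\s+(.+?)\s{2,}Completed: read failure"
    r"\s+(\d+)%\s+(\d+)\s+(\d+)\s*$"
)


def parse_excerpt(text):
    """Return (attribute raw values by name, LBAs of failed self-tests)."""
    attrs = {}
    failed_lbas = []
    for line in text.splitlines():
        m = ATTR_RE.match(line)
        if m:
            attrs[m.group(2)] = int(m.group(3))
            continue
        m = TEST_RE.match(line)
        if m:
            failed_lbas.append(int(m.group(5)))
    return attrs, failed_lbas


def lba_to_fs_block(lba, part_start_lba, sector_size=512, block_size=4096):
    """Map a whole-disk LBA to a block number inside a partition.

    part_start_lba is hypothetical input: the first sector of the
    partition, as reported by a partitioning tool.
    """
    if lba < part_start_lba:
        raise ValueError("LBA lies before the start of the partition")
    return (lba - part_start_lba) * sector_size // block_size


attrs, failed_lbas = parse_excerpt(SMART_EXCERPT)
```

With the excerpt above, attrs holds the two reallocation counters and failed_lbas holds the LBA reported by both self-tests. Calling lba_to_fs_block(failed_lbas[0], 63) would give the corresponding filesystem block only if the partition really started at sector 63, which is a guess made for illustration.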
I can guarantee this: I repeated read tests on the disk extensively, and there was no way I could force the drive to relocate more failing sectors using its own SMART mechanism.

So, what I mean is that the hardware bad block relocation feature may not work as expected, even on modern drives. Because of a buggy implementation? I don't know.

You didn't answer my main question: does ext4 do anything when a read/write failure is detected in the block device layer? Exotic filesystems like NTFS (when running Windows, sure) seem to update their bad blocks list online, so it doesn't seem a bad thing for notebook/desktop users.

The same problem is open for DM users: since evms is deprecated, there is no longer a BBR target. So what happens if, for example, your buggy hard drive doesn't intercept the first and only failing sector? The error reaches the block device layer and the failing drive is deactivated/removed from the RAID volume. It's not good for me to throw away a disk for just one failing sector. This is a matter for another mailing list, so please ignore.

Regards,
Francesco
Attachment:
smartctl-all
Description: Binary data