On Sat, Mar 14, 2015 at 4:09 PM, Tom Horsley <horsley1953@xxxxxxxxx> wrote:
> On Sat, 14 Mar 2015 16:53:15 -0500
> Roger Heflin wrote:
>
>> Also usually the errors are found by linux doing a read against it, so
>> there should be error messages on the reads in the messages file when
>> it happened, that is usually what I use to determine what sectors are
>> getting the error.
>
> Yea, I poked around in the logs and the very first thing
> that looks like any kind of error is the smart message
> showing up for the first time (and repeating every
> 30 minutes since then in an attempt to fill up the logs :-).

I'd say the first step is to confirm this is due to a media error
rather than something else, otherwise you end up down a rat hole. The
top post here is a good example of a URE due to media error.
http://ubuntuforums.org/archive/index.php/t-1034762.html

If the drive is attempting a recovery longer than 30 seconds, you'll
get errors along these lines (this is a write example, which is really
bad, the read version is more common).

[ 2161.457698] ata8.00: exception Emask 0x0 SAct 0x7ff SErr 0x0 action 0x6 frozen
[ 2161.457709] ata8.00: failed command: WRITE FPDMA QUEUED
[ 2161.457718] ata8.00: cmd 61/00:00:80:c4:2c/02:00:1e:00:00/40 tag 0 ncq 262144 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 2161.457723] ata8.00: status: { DRDY }
...
[ 5628.308982] ata8.00: failed command: WRITE FPDMA QUEUED
[ 5628.308990] ata8.00: cmd 61/80:50:80:34:44/01:00:50:00:00/40 tag 10 ncq 196608 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 5628.308993] ata8.00: status: { DRDY }
[ 5628.309000] ata8: hard resetting link
[ 5638.311674] ata8: softreset failed (1st FIS failed)
[ 5638.311686] ata8: hard resetting link

This is a how to on what to do about bad sectors, including partial recovery.
http://www.smartmontools.org/browser/trunk/www/badblockhowto.xml

But the tl;dr for all of that, in my opinion, is to update your
backups, and then obliterate the drive with writes. Only on a write
does the firmware determine whether a sector problem is transient or
persistent. If it's persistent, the LBA is reassigned to a reserve
sector. Once this is all done, you can restore from backups.

To do the write correctly, first you have to know whether you have a
512n or 512e drive. Most drives these days are 512e: 512 byte logical,
4096 byte physical sectors. The LBA in the error is for the first
logical sector in the bad physical sector, so writing over just that
512 byte sector will not work (it'll fail as a read error even though
you're writing, due to a read-modify-write attempt by the drive
firmware). 'parted -l' will tell you which type of drive you have.

What I suggest is this:

# badblocks -b 4096 -svw /dev/sdX

This is destructive! Note that any block numbers reported by badblocks
are predicated on the -b value, so the reported value isn't a sector
LBA. With -b 4096 on a drive with 512 byte logical sectors, you have
to multiply by 8 to get the LBA. But after this cycles through even
once, the problem should be resolved. You could let it run through all
of its passes (four test patterns by default, each written and then
read back).

What ought to be true is that you either get no errors (meaning the
read errors weren't media errors, just bad data, like from torn writes
or something), or you get some write errors with reallocations on the
first pass, and no errors on subsequent passes. If any subsequent
passes have errors, especially corruption errors, then get rid of the
drive or turn it into a play thing or send it to me :-D

--
Chris Murphy
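
For the block-number-to-LBA arithmetic mentioned above, here is a
minimal sketch. The function name block_to_lba_range and its default
arguments are made up for illustration; it assumes the -b value given
to badblocks was 4096 and that the drive has 512 byte logical sectors,
so adjust both if your setup differs.

```shell
#!/bin/sh
# Hypothetical helper: map a block number reported by badblocks to the
# first and last logical sector LBA it covers.
#   $1 = block number as printed by badblocks
#   $2 = the -b value passed to badblocks (assumed default: 4096)
#   $3 = logical sector size in bytes (assumed default: 512)
block_to_lba_range() {
    block=$1
    block_size=${2:-4096}
    sector_size=${3:-512}
    # Number of logical sectors per badblocks block (8 for 4096/512).
    ratio=$((block_size / sector_size))
    first=$((block * ratio))
    last=$((first + ratio - 1))
    echo "$first $last"
}

# Example: badblocks block 1000 covers LBAs 8000 through 8007.
block_to_lba_range 1000
```

The first number is the one to compare against the LBA in the kernel
or SMART error messages, since those report logical sectors.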