On Monday, 31.07.2006, at 05:22 +0900, Tejun Heo wrote:
> Martin Ammermüller wrote:
> > With high disk I/O and a 2.6.18-rc1 kernel i get these errors (depending
> > upon the work i do, up to several times a day):
> >
> > ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x400000 action 0x2 frozen
> > ata1.00: (BMDMA stat 0x20)
> > ata1.00: tag 0 cmd 0xc8 Emask 0x2 stat 0x58 err 0x0 (HSM violation)
>
> Hmm... Interesting. It gets HSM violation first.
>
> > ata1: soft resetting port
> > ata1: port is slow to respond, please be patient
> > ata1: port failed to respond (30 secs)
> > ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> > ATA: abnormal status 0xD8 on port 0xDCA18087
> > ATA: abnormal status 0xD8 on port 0xDCA18087
> > ATA: abnormal status 0xD8 on port 0xDCA18087
> > ATA: abnormal status 0xD8 on port 0xDCA18087
> > ATA: abnormal status 0xD8 on port 0xDCA18087
> > ata1.00: qc timeout (cmd 0xec)
> > ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> > ata1.00: revalidation failed (errno=-5)
> > ata1: failed to recover some devices, retrying in 5 secs
> > ata1: hard resetting port
> > ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> > ata1.00: configured for UDMA/100
> > ata1: EH complete
>
> Then two timeouts while recovering.
>
> > SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
> > sda: Write Protect is off
> > sda: Mode Sense: 00 3a 00 00
> > SCSI device sda: drive cache: write back
> >
> >> Anyways, if your harddisk is doing this regularly,
> >> your hardware is faulty. Maybe the connection between the controller
> >> and the disk is the problem or the disk itself.
> >
> > I did not get those errors with Windows XP and i am not the only one who
> > has problems running this particular laptop model with a linux kernel.
> > Ok, to be honest, there's actually only one person i know of which
> > bothered enough about exactly the same errors to send me an e-mail (he
> > discovered at least one of my messages to this list). But in my
> > experience there are almost always others getting the same error, but
> > which remain silent.
>
> It might be that the drive is quirky and raises interrupts prematurely
> sometimes. Depending on how the driver performs recovery, the effect
> can be hidden from user. Can you try the attached patch and see how the
> kernel acts?

I tried the patch, but I could not see any change in the kernel output.
I also noticed that there are actually two slightly different error
messages.

#1 (shorter one, without HSM violation):

ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata1.00: (BMDMA stat 0x21)
ata1.00: tag 0 cmd 0xc8 Emask 0x4 stat 0x40 err 0x0 (timeout)
ata1: port is slow to respond, please be patient
ata1: port failed to respond (30 secs)
ata1: soft resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/100
ata1: EH complete
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back

#2 (longer, with HSM violation):

ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x400000 action 0x2 frozen
ata1.00: (BMDMA stat 0x20)
ata1.00: tag 0 cmd 0xc8 Emask 0x2 stat 0x58 err 0x0 (HSM violation)
ata1: soft resetting port
ata1: port is slow to respond, please be patient
ata1: port failed to respond (30 secs)
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ATA: abnormal status 0xD8 on port 0xDCA44087
ATA: abnormal status 0xD8 on port 0xDCA44087
ATA: abnormal status 0xD8 on port 0xDCA44087
ATA: abnormal status 0xD8 on port 0xDCA44087
ATA: abnormal status 0xD8 on port 0xDCA44087
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1: failed to recover some devices, retrying in 5 secs
ata1: hard resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/100
ata1: EH complete
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back

Additionally, I have attached the output of
"smartctl --all -d ata /dev/sda".

Regards,
Martin Ammermüller

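For reading the hex values in the two error messages above, a small decoder may help. This is only a sketch: the bit names are the ones I recall from the ATA status register definition, the bus-master IDE status register, and the AC_ERR_* flags in include/linux/libata.h of that kernel series, so please check them against your own tree. The table names and the `decode` helper are made up for illustration.

```python
# Sketch: decode the hex fields from the libata messages into bit names.
# Bit names are assumptions taken from the ATA spec and libata.h as I
# remember them; verify against the headers of the kernel in use.

ATA_STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x04: "CORR",  # obsolete
    0x02: "IDX",   # obsolete
    0x01: "ERR",   # error
}

BMDMA_STATUS_BITS = {
    0x40: "DRV1_DMA_CAPABLE",
    0x20: "DRV0_DMA_CAPABLE",
    0x04: "INTR",
    0x02: "ERR",
    0x01: "ACTIVE",
}

AC_ERR_BITS = {
    0x100: "OTHER",
    0x80: "INVALID",
    0x40: "SYSTEM",
    0x20: "HOST_BUS",
    0x10: "ATA_BUS",
    0x08: "MEDIA",
    0x04: "TIMEOUT",
    0x02: "HSM",
    0x01: "DEV",
}


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True)
            if value & bit]


if __name__ == "__main__":
    # Values copied from messages #1 and #2.
    for label, value, table in [
        ("stat 0x40 (#1)", 0x40, ATA_STATUS_BITS),
        ("stat 0x58 (#2)", 0x58, ATA_STATUS_BITS),
        ("abnormal status 0xD8", 0xD8, ATA_STATUS_BITS),
        ("BMDMA stat 0x21 (#1)", 0x21, BMDMA_STATUS_BITS),
        ("BMDMA stat 0x20 (#2)", 0x20, BMDMA_STATUS_BITS),
        ("Emask 0x4 (#1)", 0x4, AC_ERR_BITS),
        ("Emask 0x2 (#2)", 0x2, AC_ERR_BITS),
    ]:
        print(f"{label}: {' '.join(decode(value, table))}")
```

Read this way, message #2 has DRQ still set in the status (0x58), and the 0xD8 seen during the reset is the same value with BSY added, while message #1 shows the DMA engine still marked active (0x21) at the time of the timeout.
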
smartctl version 5.34 [i686-pc-linux-gnu] Copyright (C) 2002-5 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     TOSHIBA MK8032GSX
Serial Number:    26GI5560S
Firmware Version: AS111G
User Capacity:    80.026.361.856 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   6
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sat Aug 5 15:33:19 2006 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 331) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  65) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       2104
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       482
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       1729
 10 Spin_Retry_Count        0x0033   109   100   030    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       394
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       97
193 Load_Cycle_Count        0x0032   089   089   000    Old_age   Always       -       111507
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       37 (Lifetime Min/Max 15/50)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       122
222 Loaded_Hours            0x0032   098   098   000    Old_age   Always       -       1168
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0
226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       324
240 Head_Flying_Hours       0x0001   100   100   001    Pre-fail  Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error        00%             1729  -
# 2  Short offline       Completed without error        00%             1728  -
# 3  Short offline       Completed without error        00%             1128  -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
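
The attribute table in the smartctl output above is easier to compare between runs once it is parsed. The following is a minimal sketch, assuming the usual ten-column layout of `smartctl` attribute lines; `parse_attributes` and `raw_int` are hypothetical helper names, and the sample lines are copied from the output above. As one worked figure it divides Load_Cycle_Count by Power_On_Hours. Whether that rate has anything to do with the ata1 errors is not something this data shows.

```python
# Sketch: parse smartctl attribute lines and derive one ratio from them.
# Assumes the ten-column layout: ID NAME FLAG VALUE WORST THRESH TYPE
# UPDATED WHEN_FAILED RAW_VALUE (RAW_VALUE may contain extra words).

SAMPLE = """\
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       1729
193 Load_Cycle_Count        0x0032   089   089   000    Old_age   Always       -       111507
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       37 (Lifetime Min/Max 15/50)
"""


def parse_attributes(text):
    """Map attribute name to its id, normalized values and raw string."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip headers and anything that is not an attribute row.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attrs[fields[1]] = {
            "id": int(fields[0]),
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "raw": " ".join(fields[9:]),
        }
    return attrs


def raw_int(attr):
    """Leading integer of the raw value, ignoring any trailing annotation."""
    return int(attr["raw"].split()[0])


attrs = parse_attributes(SAMPLE)
rate = raw_int(attrs["Load_Cycle_Count"]) / raw_int(attrs["Power_On_Hours"])
print(f"{rate:.1f} load cycles per power-on hour")
```

With the values above this is 111507 / 1729, or roughly 64.5 head load cycles per power-on hour.
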