On 31 July 2011 18:59, Mathias Burén <mathias.buren@xxxxxxxxx> wrote: > On 31 July 2011 14:05, Mathias Burén <mathias.buren@xxxxxxxxx> wrote: >> Hi list, >> >> Here's the output of my weekly script: >> >> DEV EVENTS REALL PEND UNCORR CRC RAW ZONE END >> sdb1 6158767 0 0 0 2 0 0 >> sdc1 6158767 0 0 0 0 0 0 >> sdd1 6158767 0 0 0 0 0 0 >> sde1 6158767 0 0 0 0 0 1 >> sdf1 6158767 0 0 0 0 47 6 >> sdg1 6158767 0 0 0 0 0 0 >> sdh1 6158767 0 6 0 0 340 3 >> >> >> Personalities : [raid6] [raid5] [raid4] >> md0 : active raid6 sdf1[5] sdh1[6] sdg1[0] sde1[7] sdc1[3] sdd1[4] sdb1[1] >> 9751756800 blocks super 1.2 level 6, 64k chunk, algorithm 2 >> [7/7] [UUUUUUU] >> >> unused devices: <none> >> >> >> /dev/md0: >> Version : 1.2 >> Creation Time : Tue Oct 19 08:58:41 2010 >> Raid Level : raid6 >> Array Size : 9751756800 (9300.00 GiB 9985.80 GB) >> Used Dev Size : 1950351360 (1860.00 GiB 1997.16 GB) >> Raid Devices : 7 >> Total Devices : 7 >> Persistence : Superblock is persistent >> >> Update Time : Sun Jul 31 09:50:43 2011 >> State : clean >> Active Devices : 7 >> Working Devices : 7 >> Failed Devices : 0 >> Spare Devices : 0 >> >> Layout : left-symmetric >> Chunk Size : 64K >> >> Name : ion:0 (local to host ion) >> UUID : e6595c64:b3ae90b3:f01133ac:3f402d20 >> Events : 6158767 >> >> Number Major Minor RaidDevice State >> 0 8 97 0 active sync /dev/sdg1 >> 1 8 17 1 active sync /dev/sdb1 >> 4 8 49 2 active sync /dev/sdd1 >> 3 8 33 3 active sync /dev/sdc1 >> 5 8 81 4 active sync /dev/sdf1 >> 6 8 113 5 active sync /dev/sdh1 >> 7 8 65 6 active sync /dev/sde1 >> >> Here's the SMART data for sdh: >> >> >> smartctl 5.41 2011-06-09 r3365 [x86_64-linux-2.6.39-ck] (local build) >> Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net >> >> === START OF INFORMATION SECTION === >> Model Family: SAMSUNG SpinPoint F4 EG (AFT) >> Device Model: SAMSUNG HD204UI >> Serial Number: S2HGJ1RZ800850 >> LU WWN Device Id: 5 0024e9 003f1ebc9 >> Firmware Version: 1AQ10003 >> User 
Capacity: 2,000,398,934,016 bytes [2.00 TB] >> Sector Size: 512 bytes logical/physical >> Device is: In smartctl database [for details use: -P show] >> ATA Version is: 8 >> ATA Standard is: ATA-8-ACS revision 6 >> Local Time is: Sun Jul 31 14:03:32 2011 IST >> >> ==> WARNING: Using smartmontools or hdparm with this >> drive may result in data loss due to a firmware bug. >> ****** THIS DRIVE MAY OR MAY NOT BE AFFECTED! ****** >> Buggy and fixed firmware report same version number! >> See the following web pages for details: >> http://www.samsung.com/global/business/hdd/faqView.do?b2b_bbs_msg_id=386 >> http://sourceforge.net/apps/trac/smartmontools/wiki/SamsungF4EGBadBlocks >> >> SMART support is: Available - device has SMART capability. >> SMART support is: Enabled >> >> === START OF READ SMART DATA SECTION === >> SMART overall-health self-assessment test result: PASSED >> >> General SMART Values: >> Offline data collection status: (0x82) Offline data collection activity >> was completed without error. >> Auto Offline Data Collection: Enabled. >> Self-test execution status: ( 37) The self-test routine was interrupted >> by the host with a hard or soft reset. >> Total time to complete Offline >> data collection: (20640) seconds. >> Offline data collection >> capabilities: (0x5b) SMART execute Offline immediate. >> Auto Offline data collection on/off support. >> Suspend Offline collection upon new >> command. >> Offline surface scan supported. >> Self-test supported. >> No Conveyance Self-test supported. >> Selective Self-test supported. >> SMART capabilities: (0x0003) Saves SMART data before entering >> power-saving mode. >> Supports SMART auto save timer. >> Error logging capability: (0x01) Error logging supported. >> General Purpose Logging supported. >> Short self-test routine >> recommended polling time: ( 2) minutes. >> Extended self-test routine >> recommended polling time: ( 255) minutes. >> SCT capabilities: (0x003f) SCT Status supported. 
>> SCT Error Recovery Control supported. >> SCT Feature Control supported. >> SCT Data Table supported. >> >> SMART Attributes Data Structure revision number: 16 >> Vendor Specific SMART Attributes with Thresholds: >> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE >> UPDATED WHEN_FAILED RAW_VALUE >> 1 Raw_Read_Error_Rate 0x002f 100 100 051 Pre-fail >> Always - 340 >> 2 Throughput_Performance 0x0026 055 053 000 Old_age >> Always - 18989 >> 3 Spin_Up_Time 0x0023 067 044 025 Pre-fail >> Always - 10165 >> 4 Start_Stop_Count 0x0032 100 100 000 Old_age >> Always - 18 >> 5 Reallocated_Sector_Ct 0x0033 252 252 010 Pre-fail >> Always - 0 >> 7 Seek_Error_Rate 0x002e 252 252 051 Old_age >> Always - 0 >> 8 Seek_Time_Performance 0x0024 252 252 015 Old_age >> Offline - 0 >> 9 Power_On_Hours 0x0032 100 100 000 Old_age >> Always - 6447 >> 10 Spin_Retry_Count 0x0032 252 252 051 Old_age >> Always - 0 >> 11 Calibration_Retry_Count 0x0032 252 252 000 Old_age >> Always - 0 >> 12 Power_Cycle_Count 0x0032 100 100 000 Old_age >> Always - 20 >> 181 Program_Fail_Cnt_Total 0x0022 100 100 000 Old_age >> Always - 10117271 >> 191 G-Sense_Error_Rate 0x0022 100 100 000 Old_age >> Always - 1 >> 192 Power-Off_Retract_Count 0x0022 252 252 000 Old_age >> Always - 0 >> 194 Temperature_Celsius 0x0002 064 057 000 Old_age >> Always - 35 (Min/Max 16/43) >> 195 Hardware_ECC_Recovered 0x003a 100 100 000 Old_age >> Always - 0 >> 196 Reallocated_Event_Count 0x0032 252 252 000 Old_age >> Always - 0 >> 197 Current_Pending_Sector 0x0032 100 100 000 Old_age >> Always - 6 >> 198 Offline_Uncorrectable 0x0030 252 252 000 Old_age >> Offline - 0 >> 199 UDMA_CRC_Error_Count 0x0036 200 200 000 Old_age >> Always - 0 >> 200 Multi_Zone_Error_Rate 0x002a 100 100 000 Old_age >> Always - 3 >> 223 Load_Retry_Count 0x0032 252 252 000 Old_age >> Always - 0 >> 225 Load_Cycle_Count 0x0032 100 100 000 Old_age >> Always - 21 >> >> SMART Error Log Version: 1 >> No Errors Logged >> >> SMART Self-test log structure revision number 1 
>> Num Test_Description Status Remaining >> LifeTime(hours) LBA_of_first_error >> # 1 Extended offline Interrupted (host reset) 50% 6408 - >> # 2 Extended offline Completed without error 00% 6317 - >> # 3 Extended offline Completed without error 00% 6260 - >> # 4 Extended offline Completed without error 00% 6232 - >> # 5 Extended offline Completed without error 00% 6170 - >> # 6 Extended offline Completed without error 00% 6064 - >> # 7 Extended offline Completed without error 00% 6029 - >> # 8 Extended offline Completed without error 00% 5898 - >> # 9 Extended offline Aborted by host 60% 5893 - >> #10 Extended offline Completed without error 00% 5728 - >> #11 Extended offline Completed without error 00% 5706 - >> #12 Extended offline Interrupted (host reset) 40% 5701 - >> #13 Extended offline Interrupted (host reset) 90% 5666 - >> #14 Extended offline Completed without error 00% 5560 - >> #15 Extended offline Completed without error 00% 5527 - >> #16 Extended offline Completed without error 00% 5392 - >> #17 Extended offline Completed without error 00% 5357 - >> #18 Extended offline Completed without error 00% 5250 - >> #19 Extended offline Completed without error 00% 4272 - >> #20 Extended offline Completed without error 00% 4017 - >> #21 Extended offline Completed without error 00% 3935 - >> >> Note: selective self-test log revision number (0) not 1 implies that >> no selective self-test has ever been run >> SMART Selective self-test log data structure revision number 0 >> Note: revision number not 1 implies that no selective self-test has >> ever been run >> SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS >> 1 0 0 Interrupted [50% left] (0-65535) >> 2 0 0 Not_testing >> 3 0 0 Not_testing >> 4 0 0 Not_testing >> 5 0 0 Not_testing >> Selective self-test flags (0x0): >> After scanning selected spans, do NOT read-scan remainder of disk. >> If Selective self-test is pending on power-up, resume after 0 minute delay. >> >> >> It has 6 pending sectors. 
Why are they not reallocated? Can I force >> this somehow? (a scrub did not reallocate them) Is this enough to >> replace the HDD? >> >> Thanks, >> Mathias >> > > Uh oh, I did another scrub, and here's the status: > > DEV EVENTS REALL PEND UNCORR CRC RAW ZONE END > sdb1 6158767 0 0 0 2 0 0 > sdc1 6158767 0 0 0 0 0 0 > sdd1 6158767 0 0 0 0 0 0 > sde1 6158767 0 0 0 0 0 1 > sdf1 6158768 0 0 0 0 47 6 > sdg1 6158767 0 0 0 0 0 0 > sdh1 6158767 0 8 1 0 341 3 > > > Personalities : [raid6] [raid5] [raid4] > md0 : active raid6 sdh1[6] sdg1[0] sdf1[5] sde1[7] sdd1[4] sdb1[1] sdc1[3] > 9751756800 blocks super 1.2 level 6, 64k chunk, algorithm 2 > [7/7] [UUUUUUU] > > unused devices: <none> > > > /dev/md0: > Version : 1.2 > Creation Time : Tue Oct 19 08:58:41 2010 > Raid Level : raid6 > Array Size : 9751756800 (9300.00 GiB 9985.80 GB) > Used Dev Size : 1950351360 (1860.00 GiB 1997.16 GB) > Raid Devices : 7 > Total Devices : 7 > Persistence : Superblock is persistent > > Update Time : Sun Jul 31 18:51:58 2011 > State : clean > Active Devices : 7 > Working Devices : 7 > Failed Devices : 0 > Spare Devices : 0 > > Layout : left-symmetric > Chunk Size : 64K > > Name : ion:0 (local to host ion) > UUID : e6595c64:b3ae90b3:f01133ac:3f402d20 > Events : 6158767 > > Number Major Minor RaidDevice State > 0 8 97 0 active sync /dev/sdg1 > 1 8 17 1 active sync /dev/sdb1 > 4 8 49 2 active sync /dev/sdd1 > 3 8 33 3 active sync /dev/sdc1 > 5 8 81 4 active sync /dev/sdf1 > 6 8 113 5 active sync /dev/sdh1 > 7 8 65 6 active sync /dev/sde1 > > smartctl 5.41 2011-06-09 r3365 [x86_64-linux-2.6.39-ck] (local build) > Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net > > === START OF INFORMATION SECTION === > Model Family: SAMSUNG SpinPoint F4 EG (AFT) > Device Model: SAMSUNG HD204UI > Serial Number: S2HGJ1RZ800850 > LU WWN Device Id: 5 0024e9 003f1ebc9 > Firmware Version: 1AQ10003 > User Capacity: 2,000,398,934,016 bytes [2.00 TB] > Sector Size: 512 bytes logical/physical > 
Device is: In smartctl database [for details use: -P show] > ATA Version is: 8 > ATA Standard is: ATA-8-ACS revision 6 > Local Time is: Sun Jul 31 18:51:59 2011 IST > > ==> WARNING: Using smartmontools or hdparm with this > drive may result in data loss due to a firmware bug. > ****** THIS DRIVE MAY OR MAY NOT BE AFFECTED! ****** > Buggy and fixed firmware report same version number! > See the following web pages for details: > http://www.samsung.com/global/business/hdd/faqView.do?b2b_bbs_msg_id=386 > http://sourceforge.net/apps/trac/smartmontools/wiki/SamsungF4EGBadBlocks > > SMART support is: Available - device has SMART capability. > SMART support is: Enabled > > === START OF READ SMART DATA SECTION === > SMART overall-health self-assessment test result: PASSED > > General SMART Values: > Offline data collection status: (0x80) Offline data collection activity > was never started. > Auto Offline Data Collection: Enabled. > Self-test execution status: ( 118) The previous self-test completed having > the read element of the test failed. > Total time to complete Offline > data collection: (20640) seconds. > Offline data collection > capabilities: (0x5b) SMART execute Offline immediate. > Auto Offline data collection on/off support. > Suspend Offline collection upon new > command. > Offline surface scan supported. > Self-test supported. > No Conveyance Self-test supported. > Selective Self-test supported. > SMART capabilities: (0x0003) Saves SMART data before entering > power-saving mode. > Supports SMART auto save timer. > Error logging capability: (0x01) Error logging supported. > General Purpose Logging supported. > Short self-test routine > recommended polling time: ( 2) minutes. > Extended self-test routine > recommended polling time: ( 255) minutes. > SCT capabilities: (0x003f) SCT Status supported. > SCT Error Recovery Control supported. > SCT Feature Control supported. > SCT Data Table supported. 
> > SMART Attributes Data Structure revision number: 16 > Vendor Specific SMART Attributes with Thresholds: > ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE > UPDATED WHEN_FAILED RAW_VALUE > 1 Raw_Read_Error_Rate 0x002f 100 100 051 Pre-fail > Always - 341 > 2 Throughput_Performance 0x0026 055 053 000 Old_age > Always - 18989 > 3 Spin_Up_Time 0x0023 067 044 025 Pre-fail > Always - 10165 > 4 Start_Stop_Count 0x0032 100 100 000 Old_age > Always - 18 > 5 Reallocated_Sector_Ct 0x0033 252 252 010 Pre-fail > Always - 0 > 7 Seek_Error_Rate 0x002e 252 252 051 Old_age > Always - 0 > 8 Seek_Time_Performance 0x0024 252 252 015 Old_age > Offline - 0 > 9 Power_On_Hours 0x0032 100 100 000 Old_age > Always - 6452 > 10 Spin_Retry_Count 0x0032 252 252 051 Old_age > Always - 0 > 11 Calibration_Retry_Count 0x0032 252 252 000 Old_age > Always - 0 > 12 Power_Cycle_Count 0x0032 100 100 000 Old_age > Always - 20 > 181 Program_Fail_Cnt_Total 0x0022 100 100 000 Old_age > Always - 10121757 > 191 G-Sense_Error_Rate 0x0022 100 100 000 Old_age > Always - 1 > 192 Power-Off_Retract_Count 0x0022 252 252 000 Old_age > Always - 0 > 194 Temperature_Celsius 0x0002 064 057 000 Old_age > Always - 31 (Min/Max 16/43) > 195 Hardware_ECC_Recovered 0x003a 100 100 000 Old_age > Always - 0 > 196 Reallocated_Event_Count 0x0032 252 252 000 Old_age > Always - 0 > 197 Current_Pending_Sector 0x0032 100 100 000 Old_age > Always - 8 > 198 Offline_Uncorrectable 0x0030 100 100 000 Old_age > Offline - 1 > 199 UDMA_CRC_Error_Count 0x0036 200 200 000 Old_age > Always - 0 > 200 Multi_Zone_Error_Rate 0x002a 100 100 000 Old_age > Always - 3 > 223 Load_Retry_Count 0x0032 252 252 000 Old_age > Always - 0 > 225 Load_Cycle_Count 0x0032 100 100 000 Old_age > Always - 21 > > SMART Error Log Version: 1 > No Errors Logged > > SMART Self-test log structure revision number 1 > Num Test_Description Status Remaining > LifeTime(hours) LBA_of_first_error > # 1 Extended offline Completed: read failure 60% 6452 > 1519520304 > # 2 Extended 
offline Interrupted (host reset) 50% 6408 - > # 3 Extended offline Completed without error 00% 6317 - > # 4 Extended offline Completed without error 00% 6260 - > # 5 Extended offline Completed without error 00% 6232 - > # 6 Extended offline Completed without error 00% 6170 - > # 7 Extended offline Completed without error 00% 6064 - > # 8 Extended offline Completed without error 00% 6029 - > # 9 Extended offline Completed without error 00% 5898 - > #10 Extended offline Aborted by host 60% 5893 - > #11 Extended offline Completed without error 00% 5728 - > #12 Extended offline Completed without error 00% 5706 - > #13 Extended offline Interrupted (host reset) 40% 5701 - > #14 Extended offline Interrupted (host reset) 90% 5666 - > #15 Extended offline Completed without error 00% 5560 - > #16 Extended offline Completed without error 00% 5527 - > #17 Extended offline Completed without error 00% 5392 - > #18 Extended offline Completed without error 00% 5357 - > #19 Extended offline Completed without error 00% 5250 - > #20 Extended offline Completed without error 00% 4272 - > #21 Extended offline Completed without error 00% 4017 - > > Note: selective self-test log revision number (0) not 1 implies that > no selective self-test has ever been run > SMART Selective self-test log data structure revision number 0 > Note: revision number not 1 implies that no selective self-test has > ever been run > SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS > 1 0 0 Completed_read_failure [60% left] (0-65535) > 2 0 0 Not_testing > 3 0 0 Not_testing > 4 0 0 Not_testing > 5 0 0 Not_testing > Selective self-test flags (0x0): > After scanning selected spans, do NOT read-scan remainder of disk. > If Selective self-test is pending on power-up, resume after 0 minute delay. > > :( I do have a bad HDD. RMA is already in progress, I need to take out > the drive and ship it to Samsung in Holland. Will print labels at work > on Tuesday. 
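A quick way to spot a drive like this in the weekly table is to filter on the PEND and UNCORR columns. The sketch below is only an illustration: `flag_suspect_drives` is a hypothetical helper name, and it assumes the table keeps the column order shown in this thread (DEV, EVENTS, REALL, PEND, UNCORR, ...), with the rows copied from the second status table.

```shell
# Hypothetical helper: print devices whose PEND or UNCORR column is non-zero.
# Expects the weekly table on stdin, header line first:
#   DEV EVENTS REALL PEND UNCORR CRC RAW ZONE END
flag_suspect_drives() {
    awk 'NR > 1 && NF >= 5 && ($4 > 0 || $5 > 0) {
        printf "%s pending=%s uncorrectable=%s\n", $1, $4, $5
    }'
}

# Rows copied from the second status table in this thread.
status_table='DEV EVENTS REALL PEND UNCORR CRC RAW ZONE END
sdb1 6158767 0 0 0 2 0 0
sdf1 6158768 0 0 0 0 47 6
sdh1 6158767 0 8 1 0 341 3'

printf '%s\n' "$status_table" | flag_suspect_drives
```

With these rows only sdh1 is reported. The non-zero RAW and ZONE values on sdf1 are deliberately ignored here, since the filter looks at pending and uncorrectable sectors only.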
>
> Questions:
>
> * How do I remove this HDD without causing damage to the array? Is
> this the correct way?
> mdadm --manage /dev/md0 --fail /dev/sdh1 # fail the device
> mdadm --manage /dev/md0 --remove /dev/sdh1 # remove the device
> * (shut down the system gracefully)
> * (remove the HDD)
> * (install new HDD)
> * (start system)
> sfdisk -d /dev/sde | sfdisk /dev/sdh # partition the new HDD
> mdadm --manage /dev/md0 --add /dev/sdh1 # add the partition to the array
>
> * After removing the HDD, should I do another scrub?
>
> Thanks a lot in advance!
>
> /Mathias
>

I think I need to hurry up:

[13957.348692] ata10.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x6 frozen
[13957.348704] ata10.00: failed command: READ FPDMA QUEUED
[13957.348716] ata10.00: cmd 60/08:00:00:ab:5a/00:00:e1:00:00/40 tag 0 ncq 4096 in
[13957.348719] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[13957.348724] ata10.00: status: { DRDY }
[13957.348730] ata10.00: failed command: WRITE FPDMA QUEUED
[13957.348741] ata10.00: cmd 61/08:08:a8:e7:8d/00:00:2e:00:00/40 tag 1 ncq 4096 out
[13957.348743] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[13957.348749] ata10.00: status: { DRDY }
[13957.348759] ata10: hard resetting link
[13962.835319] ata10: link is slow to respond, please be patient (ready=0)
[13967.368679] ata10: SRST failed (errno=-16)
[13967.368693] ata10: hard resetting link
[13970.988699] ata10: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[13971.002394] ata10.00: configured for UDMA/133
[13971.002407] ata10.00: device reported invalid CHS sector 0
[13971.002413] ata10.00: device reported invalid CHS sector 0
[13971.002427] ata10: EH complete
[14001.358848] ata10.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x6 frozen
[14001.358862] ata10.00: failed command: READ FPDMA QUEUED
[14001.358887] ata10.00: cmd 60/08:08:00:ab:5a/00:00:e1:00:00/40 tag 1 ncq 4096 in
[14001.358890] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[14001.358898] ata10.00: status: { DRDY }
[14001.358913] ata10: hard resetting link
[14006.845324] ata10: link is slow to respond, please be patient (ready=0)
[14011.378656] ata10: SRST failed (errno=-16)
[14011.378669] ata10: hard resetting link
[14016.865323] ata10: link is slow to respond, please be patient (ready=0)
[14021.398640] ata10: SRST failed (errno=-16)
[14021.398652] ata10: hard resetting link
[14026.885310] ata10: link is slow to respond, please be patient (ready=0)
[14029.925349] ata10: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[14029.939048] ata10.00: configured for UDMA/133
[14029.939061] ata10.00: device reported invalid CHS sector 0
[14029.939078] ata10: EH complete
[14060.345358] ata10.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[14060.345371] ata10.00: failed command: READ FPDMA QUEUED
[14060.345384] ata10.00: cmd 60/08:00:00:ab:5a/00:00:e1:00:00/40 tag 0 ncq 4096 in
[14060.345387] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[14060.345394] ata10.00: status: { DRDY }
[14060.345407] ata10: hard resetting link
[14065.831985] ata10: link is slow to respond, please be patient (ready=0)
[14070.365333] ata10: SRST failed (errno=-16)
[14070.365345] ata10: hard resetting link
[14074.625347] ata10: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[14074.639043] ata10.00: configured for UDMA/133
[14074.639056] ata10.00: device reported invalid CHS sector 0
[14074.639088] ata10: EH complete
[14105.358687] ata10.00: NCQ disabled due to excessive errors
[14105.358700] ata10.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[14105.358712] ata10.00: failed command: READ FPDMA QUEUED
[14105.358729] ata10.00: cmd 60/08:00:00:ab:5a/00:00:e1:00:00/40 tag 0 ncq 4096 in
[14105.358732] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[14105.358741] ata10.00: status: { DRDY }
[14105.358754] ata10: hard resetting link
[14110.845314] ata10: link is slow to respond, please be patient (ready=0)
[14115.378674] ata10: SRST failed (errno=-16)
[14115.378689] ata10: hard resetting link
[14119.372023] ata10: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[14119.385704] ata10.00: configured for UDMA/133
[14119.385716] ata10.00: device reported invalid CHS sector 0
[14119.385743] ata10: EH complete
[14121.527814] ata10.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
[14121.527824] ata10.00: edma_err_cause=00000084 pp_flags=00000001, dev error, EDMA self-disable
[14121.527832] ata10.00: failed command: READ DMA EXT
[14121.527844] ata10.00: cmd 25/00:08:00:ab:5a/00:00:e1:00:00/e0 tag 0 dma 4096 in
[14121.527846] res 51/89:08:00:ab:5a/89:00:e1:00:00/e0 Emask 0x10 (ATA bus error)
[14121.527852] ata10.00: status: { DRDY ERR }
[14121.527857] ata10.00: error: { ICRC }
[14121.527867] ata10: hard resetting link
[14127.011973] ata10: link is slow to respond, please be patient (ready=0)
[14131.545295] ata10: SRST failed (errno=-16)
[14131.545307] ata10: hard resetting link
[14137.031984] ata10: link is slow to respond, please be patient (ready=0)
[14141.565306] ata10: SRST failed (errno=-16)
[14141.565317] ata10: hard resetting link
[14147.051993] ata10: link is slow to respond, please be patient (ready=0)
[14161.032035] INFO: task jbd2/dm-0-8:613 blocked for more than 120 seconds.
[14161.032044] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[14161.032050] jbd2/dm-0-8 D ffffffff81823020 0 613 2 0x00000000
[14161.032060] ffff8800c91e11a0 0000000000000046 0000000000000000 ffff8800cf0697c0
[14161.032070] ffff8800cf069780 ffff8800cf069780 ffff8800c93bdfd8 ffff8800c93bdfd8
[14161.032079] ffff8800c91e13d8 0000000000004000 ffff880094ed19c0 ffff8800c97e0688
[14161.032087] Call Trace:
[14161.032106] [<ffffffff81204c67>] ? generic_make_request+0x2f7/0x570
[14161.032116] [<ffffffff81133200>] ? __wait_on_buffer+0x30/0x30
[14161.032124] [<ffffffff814262e7>] ? io_schedule+0x57/0x80
[14161.032131] [<ffffffff8113320a>] ? sleep_on_buffer+0xa/0x20
[14161.032137] [<ffffffff81426a2f>] ? __wait_on_bit+0x4f/0x80
[14161.032143] [<ffffffff81133200>] ? __wait_on_buffer+0x30/0x30
[14161.032150] [<ffffffff81426add>] ? out_of_line_wait_on_bit+0x7d/0xa0
[14161.032159] [<ffffffff81059300>] ? autoremove_wake_function+0x30/0x30
[14161.032168] [<ffffffff811b680e>] ? jbd2_journal_commit_transaction+0x155e/0x16f0
[14161.032176] [<ffffffff810592d0>] ? abort_exclusive_wait+0xb0/0xb0
[14161.032183] [<ffffffff8142968e>] ? apic_timer_interrupt+0xe/0x20
[14161.032191] [<ffffffff811baddd>] ? kjournald2+0xad/0x210
[14161.032198] [<ffffffff810592d0>] ? abort_exclusive_wait+0xb0/0xb0
[14161.032205] [<ffffffff811bad30>] ? commit_timeout+0x10/0x10
[14161.032212] [<ffffffff81058a0f>] ? kthread+0x7f/0x90
[14161.032219] [<ffffffff81429a54>] ? kernel_thread_helper+0x4/0x10
[14161.032226] [<ffffffff81058990>] ? kthread_worker_fn+0x180/0x180
[14161.032233] [<ffffffff81429a50>] ? gs_change+0xb/0xb
[14161.032261] INFO: task squid:1659 blocked for more than 120 seconds.
[14161.032264] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[14161.032269] squid D ffffffff81823020 0 1659 1657 0x00000000
[14161.032277] ffff8800c9179d60 0000000000000082 ffff8800c97b0688 ffffffff812bf074
[14161.032285] ffff880012b818c0 ffffffff81823020 ffff8800c22c5fd8 ffff8800c22c5fd8
[14161.032293] ffff8800c9179f98 0000000000004000 ffff8800c22c5fd8 0000000000000000
[14161.032301] Call Trace:

:-/ Currently shutting down all daemons that access the filesystem on the
array, in an attempt to umount the fs.
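For reference, the replacement steps proposed in the quoted message can be written as a dry run that only prints the commands, so the sequence can be reviewed before anything touches the array. This is a sketch under assumptions: `print_replace_plan` is a hypothetical helper, it assumes sdX-style device names, it assumes the new disk comes up under the same name as the old one (worth checking against the serial number after the swap), and the final `cat /proc/mdstat` is an addition for watching the rebuild.

```shell
# Print (do not execute) the replacement steps for one array member.
# Arguments: array, failing partition, healthy disk to copy the partition
# table from, replacement disk. All values here are illustrative inputs.
print_replace_plan() {
    array=$1; failing=$2; template_disk=$3; new_disk=$4
    # Assumption: the new partition gets the same number as the failing one
    # (strips everything up to the last non-digit, e.g. /dev/sdh1 -> 1).
    part_num=${failing##*[!0-9]}
    echo "mdadm --manage $array --fail $failing"
    echo "mdadm --manage $array --remove $failing"
    echo "# power down, swap the disk, boot"
    echo "sfdisk -d $template_disk | sfdisk $new_disk"
    echo "mdadm --manage $array --add $new_disk$part_num"
    echo "cat /proc/mdstat"
}

print_replace_plan /dev/md0 /dev/sdh1 /dev/sde /dev/sdh
```

Printing rather than executing keeps the sketch harmless; each printed line would still need to be run by hand, as root, once the device names have been verified.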