On 2011.06.20 at 14:31 +0200, Markus Trippelsdorf wrote:
> On 2011.06.20 at 13:45 +0200, Michael Monnerie wrote:
> > On Monday, 20 June 2011 Markus Trippelsdorf wrote:
> > > Here are two more examples. The time when the hang occurs is marked
> >
> > Could it be that some sectors on the disk are not easy to read for the
> > drive, and that it simply retries several times until it works again?
> > SATA disks can show that behaviour. You could try with "dd" with
> > seek/skip parameters so you read 1gb at once, then skip 1gb and read 1gb
> > again etc, and compare the throughput over all 1gb areas. If there's one
> > slower, that might be the problem.
> >
> > Maybe a check with "smartctl" could help, too.
>
> Thanks for the hint, Michael. I've just checked the SMART status on
> both disks and the 4kb drive looks indeed suspicious:
>
> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       8
>
> The 512 byte drive appears to be fine. But I'm running the long
> SMART self test on both of them right now and will report back
> the result in a few hours.

Hmm, both tests ran fine without any errors. And the two SMART attributes
above are back to zero again (must have been a temporary firmware
hiccup).

As you can see in the data I've posted, the disk workload consists almost
only of writes. And I don't think a disk retries writes several
times. On the contrary a write to a bad sector should fix it, because
the drive can then remap it safely. (Current_Pending_Sector would
decrease and Reallocated_Sector_Ct would increase. But
Reallocated_Sector_Ct is still 0 on both affected drives)

And shouldn't I see these "hangs" in situations other than "rm -fr", if
the disk drive would be responsible?

-- 
Markus

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
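
For anyone who wants to try the region-by-region read comparison Michael
suggests, here is a rough sketch. It is written in Python rather than as a
"dd" loop, so it is a stand-in for the dd approach and not the exact
commands anyone in this thread ran. The function name, its parameters and
the 1 GiB / 2 GiB defaults are assumptions made for illustration. It reads
through the page cache, so repeated runs on the same region can look
faster than the disk really is.

```python
import os
import time


def region_throughput(path, region_bytes=1 << 30, stride_bytes=2 << 30,
                      block_bytes=1 << 20):
    """Read one region every stride_bytes and time each region.

    With the defaults this reads 1 GiB, skips 1 GiB, reads 1 GiB, and so on.
    Returns a list of (offset, bytes_read, seconds) tuples, one per region.
    Hypothetical helper: the names and defaults are illustrative only.
    """
    results = []
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size for regular files and,
        # on Linux, for block devices as well.
        total = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < total:
            os.lseek(fd, offset, os.SEEK_SET)
            remaining = min(region_bytes, total - offset)
            done = 0
            start = time.monotonic()
            while done < remaining:
                buf = os.read(fd, min(block_bytes, remaining - done))
                if not buf:  # unexpected end of file or device
                    break
                done += len(buf)
            elapsed = time.monotonic() - start
            results.append((offset, done, elapsed))
            offset += stride_bytes
    finally:
        os.close(fd)
    return results


def mb_per_second(bytes_read, seconds):
    """Convert one measurement to MB/s; None if the timer saw no elapsed time."""
    return bytes_read / seconds / 1e6 if seconds > 0 else None
```

Calling region_throughput() on a device node (which needs read permission,
typically root) and passing each tuple's byte count and time to
mb_per_second() gives one figure per region. A region that is clearly
slower than its neighbours would be the candidate for the retry behaviour
described above.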