We observed that every process on the blade is completely stuck, even
though only one process is actually accessing the disks; the others are
mostly CPU-bound applications.

On Tue, Jun 29, 2010 at 1:11 PM, Robert Hancock <hancockrwd@xxxxxxxxx> wrote:
> On 06/28/2010 03:09 AM, Simon Li wrote:
>>
>> We don't expect the kernel to hang temporarily; any ideas for a fix?
>
> It depends what you mean by hanging. Usually if a drive is stuck trying to
> read a bad sector it will take quite a while before it gives up, and during
> that period all disk access is stalled. The kernel normally retries several
> times as well. If a bunch of applications need to access the disk,
> they'll appear to be hung up during this process.
>
>>
>> On Mon, Jun 28, 2010 at 5:02 PM, Jeff Garzik <jeff@xxxxxxxxxx> wrote:
>>>
>>> On 06/28/2010 03:38 AM, Simon Li wrote:
>>>>
>>>> May 25 15:55:06 shctc-xq-ems22-me18 kernel: ata3: status=0x25 { DeviceFault CorrectedError Error }
>>>> May 25 15:59:59 shctc-xq-ems22-me18 kernel: ata3: status=0x25 { DeviceFault CorrectedError Error }
>>>> Jun 2 10:54:06 shctc-xm-ems21-me18 kernel: ata2: status=0x25 { DeviceFault CorrectedError Error }
>>>> Jun 2 10:54:08 shctc-xm-ems21-me18 kernel: ata2: status=0x25 { DeviceFault CorrectedError Error }
>>>> Jun 2 10:54:18 shctc-xm-ems21-me18 kernel: ata2: status=0x25 { DeviceFault CorrectedError Error }
>>>
>>> Your hardware is returning errors, which libata is dutifully reporting...
>>>
>>> Jeff
>>>
>>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
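
For anyone reading the log lines in this thread, the status=0x25 value can be decoded by hand. Below is a minimal Python sketch, not taken from the kernel source: the bit-to-name table is my assumption, based on the classic ATA status register layout and the names that appear in the log output above, and the helper name decode_ata_status is made up for illustration.

```python
# Assumed mapping of ATA status register bits to the names that libata
# printed in the logs quoted in this thread. The table is illustrative,
# not copied from the kernel.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_ata_status(status):
    """Return the names of the bits set in an 8-bit ATA status value."""
    if not 0 <= status <= 0xFF:
        raise ValueError("ATA status must fit in one byte")
    # Walk the table from the high bit down, keeping names whose bit is set.
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


if __name__ == "__main__":
    # 0x25 = 0x20 | 0x04 | 0x01
    print(" ".join(decode_ata_status(0x25)))
    # prints: DeviceFault CorrectedError Error
```

With this mapping, 0x25 breaks down into 0x20 (DeviceFault), 0x04 (CorrectedError) and 0x01 (Error), which matches the text inside the braces in the kernel messages.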