Thanks for your prompt reply, Tejun.

On Sat, Mar 29, 2008 at 7:20 PM, Tejun Heo <htejun@xxxxxxxxx> wrote:
> Hello,
>
> Sagar Borikar wrote:
> > transition as well. So I get messages like COMRESET Failed and hard
> > reset failed. This doesn't happen if I insert back the drive
> > immediately. The system immediately recovers.
>
> Okay, that was one long paragraph. :-)
>
> The behavior itself (sans triggering machine reset) is intended. libata
> EH doesn't rely on the edge events (PHY status changed). It relies on
> level state (PHY readiness) and as long as at least one PHY event is
> triggered after link status has changed, it doesn't care what polarity
> those events are or how many of them are. That was the design decision
> made for robustness.

I understand. But the issue is that if I insert another drive, that
event gets detected. Only the remove event is not getting detected. So I
was wondering whether, if I can somehow get the remove events detected,
I can go ahead.

Also, digging further in the code: as expected, in the insert_remove
action dev->class becomes ATA_UNKNOWN, and hence the
ata_eh_revalidate_and_attach function doesn't execute the following if
condition:

"action & ATA_EH_REVALIDATE && ata_dev_ready"

I am attaching two logs, the remove and insert_remove files, which
indicate the flow of the sequence in these two paths. If you could
browse through them, that would be great.

> > ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
> > ata2: hard resetting port
> > ata2: port is slow to respond, please be patient
> > ata2: port failed to respond (30 secs) ---------------------> At this
> > state, actually the drive is removed. But not detected.
> > ata2: COMRESET failed (device not ready)
> > ata2: hardreset failed, retrying in 5 secs
> > ata2: hard resetting port
> > ata2: SATA link down (SStatus 0 SControl 310)
> > ata2: EH complete
>
> This is a quite old kernel, right? Recent ones take much shorter to
> detect the condition.

That's right. It is 2.6.18.

> > PMON2000 MIPS Initializing. Standby...
> > ERRORPC=bfc00004 CONFIG=0042e4bb STATUS=00400000
> > CPU PRID 000034c1, MaskID 00001320
> > Initializing caches...done (CONFIG=0042e4bb)
> > Switching to runtime address map...done
> > Setting up SDRAM controller: sdram config 0x80010000
> > master clock 100 Mhz, MulFundBIU 0x02, DivXSDRAM 0x02
> > sdram freq 0x09ef21aa hz, sdram period: 0x06 nsec
> > dimm0: density 256Mbit, width 16, single-sided, unbuffered, size
> > 0x08000000
> > supported CAS latency: 2.5 2, using 2.5 cycles, byte18=0x0c
> > RAS to CAS delay (tRCD) 0x12 nsec, byte29=0x
>
> Okay, and the machine got reboot. It's weird that the reset happens
> *after* EH is complete. After EH complete is printed, libata won't
> touch the hardware. I'm sorry but I don't have any clue why the machine
> is getting rebooted. Does the machine reset on oops?

Also, it happens after about 1 to 1.5 minutes. If I insert a drive
within this duration, the reset doesn't happen. Also, if I insert it in
any other slot, the reset doesn't happen. The reset happens only after
immediate removal of the disk.

Is there any way by which I can make the insertion event edge triggered?

Thanks in advance
Sagar

> --
> tejun
>
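To make the level-state versus edge-event distinction described above concrete, here is a minimal standalone sketch. It is not libata code: `struct link_model`, `enum eh_result` and `handle_phy_events()` are hypothetical names invented for illustration, and the real EH path involves resets and revalidation that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model for illustration only; not a libata structure. */
struct link_model {
	bool phy_ready;     /* level state: what the PHY reports right now */
	bool dev_attached;  /* what the software currently believes */
};

enum eh_result { EH_NO_CHANGE, EH_ATTACH, EH_DETACH };

/*
 * Level-based handling: nr_events only tells us that something happened.
 * The count and polarity of the events are ignored; the decision comes
 * from comparing the current level with the software's view.
 */
enum eh_result handle_phy_events(struct link_model *l, unsigned int nr_events)
{
	assert(l != NULL);

	if (nr_events == 0)
		return EH_NO_CHANGE;    /* nothing woke the handler up */

	if (l->phy_ready && !l->dev_attached) {
		l->dev_attached = true;
		return EH_ATTACH;
	}
	if (!l->phy_ready && l->dev_attached) {
		l->dev_attached = false;
		return EH_DETACH;
	}
	return EH_NO_CHANGE;            /* level already matches our view */
}
```

With this shape, a missed or duplicated event does not matter as long as at least one event arrives after the link state has settled, which is the robustness argument quoted above.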
[root@NAS00180001310e ~]# sil_host_intr:1
sil_host_intr:4
ata_port_freeze
sil_freeze start
sil_freeze end
ata1 port frozen
ata_bmdma_error_handler: start
ata_do_eh : ata_eh_autopsy
ata_eh_autopsy : start
ata_do_eh : ata_eh_report
ata1: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0x2 frozen
ata_do_eh : ata_eh_recover
ata_eh_detach_dev : start
ata_eh_detach_dev : ata_eh_prep_resume
ata_eh_detach_dev : ata_eh_skip_recovery
sil_freeze start
sil_freeze end
ata1 port frozen
ata_eh_detach_dev : ata_eh_reset
ata_std_prereset: start
ata1: hard resetting port
ata_port_offline : port is offline
sata_print_link_status:start
ata1: SATA link down (SStatus 0 SControl 310)
sata_print_link_status:end
ata_eh_detach_dev : ata_eh_thaw_port
ata_eh_detach_dev : ata_eh_revalidate_and_attach
ata_port_offline : port is offline
ata1: failed to recover some devices, retrying in 5 secs
ata_eh_detach_dev : ata_eh_prep_resume
ata_eh_detach_dev : ata_eh_skip_recovery
sil_freeze start
sil_freeze end
ata1 port frozen
ata_eh_detach_dev : ata_eh_reset
ata_std_prereset: start
ata1: hard resetting port
lcdout: error response: ng
ata_port_offline : port is offline
sata_print_link_status:start
ata1: SATA link down (SStatus 0 SControl 310)
sata_print_link_status:end
ata_eh_detach_dev : ata_eh_thaw_port
ata_eh_detach_dev : ata_eh_revalidate_and_attach
ata_port_offline : port is offline
ata1: failed to recover some devices, retrying in 5 secs
ata_eh_detach_dev : ata_eh_prep_resume
ata_eh_detach_dev : ata_eh_skip_recovery
sil_freeze start
sil_freeze end
ata1 port frozen
ata_eh_detach_dev : ata_eh_reset
ata_std_prereset: start
ata1: hard resetting port
ata_port_offline : port is offline
sata_print_link_status:start
ata1: SATA link down (SStatus 0 SControl 310)
sata_print_link_status:end
ata_eh_detach_dev : ata_eh_thaw_port
ata_eh_detach_dev : ata_eh_revalidate_and_attach
ata_port_offline : port is offline
ata1.00: disabled
ata_port_offline : port is offline
ata_eh_detach_dev : start
ata_eh_detach_dev : start
ata_eh_detach_dev : ata_eh_prep_resume
ata_eh_detach_dev : ata_eh_skip_recovery
ata_eh_detach_dev : ata_eh_revalidate_and_attach
ata_eh_detach_dev : ata_eh_resume
ata_eh_detach_dev : ata_eh_suspend
ata_do_eh : ata_eh_finish
ata_eh_finish : start
ata_bmdma_error_handler: end
ata1: EH complete
ata1.00: detaching (SCSI 0:0:0:0)
[root@NAS00180001310e ~]# Drive 1 inserted
sil_host_intr:1
sil_host_intr:4
ata_port_freeze
sil_freeze start
sil_freeze end
ata1 port frozen
ata_bmdma_error_handler: start
ata_do_eh : ata_eh_autopsy
ata_eh_autopsy : start
ata_do_eh : ata_eh_report
ata1: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata_do_eh : ata_eh_recover
ata_eh_detach_dev : start
ata_eh_detach_dev : ata_eh_prep_resume
ata_eh_detach_dev : ata_eh_skip_recovery
sil_freeze start
sil_freeze end
ata1 port frozen
ata_eh_detach_dev : ata_eh_reset
ata_std_prereset: start
ata1: hard resetting port
ata_port_offline : port is offline
sata_print_link_status:start
ata1: SATA link down (SStatus 0 SControl 310)
sata_print_link_status:end
ata_eh_detach_dev : ata_eh_thaw_port
ata_eh_detach_dev : ata_eh_revalidate_and_attach
ata_eh_detach_dev : ata_eh_resume
ata_eh_detach_dev : ata_eh_suspend
ata_do_eh : ata_eh_finish
ata_eh_finish : start
ata_bmdma_error_handler: end
ata1: EH complete
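
For reading the "SStatus 0 SControl 310" values in these logs, the sketch below decodes the SCR fields using the layout from the Serial ATA specification (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8). The helper names are made up for illustration and are not libata functions. The DET == 3 test is, as an assumption based on my reading of the libata sources, the same level check that sits behind the "port is offline" messages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers; field positions follow the SATA SCR layout. */
static inline unsigned int scr_det(uint32_t scr) { return scr & 0xf; }
static inline unsigned int scr_spd(uint32_t scr) { return (scr >> 4) & 0xf; }
static inline unsigned int scr_ipm(uint32_t scr) { return (scr >> 8) & 0xf; }

/* SStatus DET == 3: device present and PHY communication established. */
static inline bool link_online(uint32_t sstatus)
{
	return scr_det(sstatus) == 0x3;
}

/* Worked example using the values printed in the logs (they are hex). */
void check_logged_values(void)
{
	assert(scr_det(0x0) == 0);    /* SStatus 0: no device detected */
	assert(!link_online(0x0));    /* hence "SATA link down" */
	assert(scr_det(0x310) == 0);  /* SControl: no reset being requested */
	assert(scr_spd(0x310) == 1);  /* speed limited to Gen1 */
	assert(scr_ipm(0x310) == 3);  /* partial and slumber transitions disabled */
}
```

Both logs show SStatus 0 at reset time, so by this check the link is down in either path.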