On Fri, 23 Sep 2011, Iain Barker wrote:

> >> ehci_hcd 0000:00:1a.0: HC died; cleaning up
> > Oddly enough, this error message has nothing to do with usb-storage.
> > It refers to the periodic schedule, which is used for things like
> > hubs, input devices, audio & video, etc.
>
> Thanks for the clarification. It definitely seems to coincide with
> when the USB storage disconnects. But I guess that makes sense: the
> hub is internal to the server, so the kb/mouse hang off it also. If
> that hub (or the ehci-hcd itself) is reset then the usb-storage would
> also be disconnected.
>
> > Include more of the context
> After the USB storage disconnect, I do see the usb-storage device
> reconnect almost immediately. But filesystems mounted at the time of
> the disconnect become inaccessible; the USB storage device comes back
> as a new sd entry.

That could not have happened after the error you copied above. After
that "HC died; cleaning up" message, the controller is completely
unusable. You would have to unbind and rebind it to the ehci-hcd driver.

> e.g. if it was sda before the reset then it will come back as sdb,
> and the filesystems on sda are then no longer accessible. Eventually
> ext3 gives up on the dead sda and panics.
>
> > Can you build a kernel with CONFIG_USB_DEBUG enabled and post the
> > dmesg log from a failure?
>
> I will try. The issue is not easy to reproduce, but it does occur on
> multiple servers so I'll try to collate more data.
>
> Unfortunately, we use the HP as an embedded server and that USB
> storage is where the root filesystem and logs are written to, so when
> it disconnects we also lose the kernel logs etc. But I'll set up a
> serial console to try and capture the kernel debug messages
> remotely.
>
> For reference, we have about 30 of the HP DL120 G7 servers, of which
> at least 4 have shown the USB disconnect problem. The same USB
> storage modules are rock-solid on the previous generation HP DL120 G6
> servers.

You have to wonder what changed...

> Thought: I wonder if there is a way to leverage the USB suspend
> functionality, so that the device reconnects back in as the existing
> sda, instead of as a new device... that might be sufficient to mask
> the filesystem problem even if it doesn't address why the hub is
> being reset.

I do not think this is possible. Certainly not for the sort of error
you got above. But more information is needed to get a clearer picture
of the problem.

Alan Stern
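
For reference, the unbind/rebind step mentioned above could be sketched
roughly as follows. This is only an illustration, not something taken
from this thread: it assumes the controller's PCI driver directory in
sysfs is named ehci_hcd (matching the prefix in the log line; on other
kernels it may be named differently, for example ehci-pci), that the
script runs as root, and the helper name rebind_pci_device is made up
for the example.

```python
import os


def rebind_pci_device(pci_addr, driver="ehci_hcd", sysfs_root="/sys"):
    """Unbind a PCI device from its driver, then bind it again, via sysfs.

    pci_addr   -- PCI address as shown in the log, e.g. "0000:00:1a.0"
    driver     -- sysfs driver directory name (an assumption; check
                  /sys/bus/pci/drivers on the machine in question)
    sysfs_root -- overridable so the function can be exercised on a
                  scratch directory instead of the real /sys
    """
    driver_dir = os.path.join(sysfs_root, "bus", "pci", "drivers", driver)
    # Writing the address to "unbind" detaches the driver from the device;
    # writing it to "bind" asks the same driver to probe the device again.
    for action in ("unbind", "bind"):
        with open(os.path.join(driver_dir, action), "w") as f:
            f.write(pci_addr)


# Example call (needs root on a real system):
#   rebind_pci_device("0000:00:1a.0")
```

Even if the rebind succeeds, the devices below the controller are
enumerated again from scratch, so this would not bring the storage back
under its old name.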