On Sun, Aug 21, 2022 at 05:40:23PM +0100, James Dutton wrote:
> On Sun, 21 Aug 2022 at 17:36, James Dutton <james.dutton@xxxxxxxxx> wrote:
> >
> > On Sun, 21 Aug 2022 at 15:47, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > > The reason being, I have a system that boots from a USB disk.
> > > > Due to interference, the USB device disconnects for a second or two
> > > > and then comes back, but Linux does not see it and I have to reboot
> > > > Linux to recover. So, in this situation I wish Linux to be able to
> > > > recover immediately, without needing a reboot.
> > >
> > > There is no way to do this. For example, consider all those failed
> > > writes that you get error messages about. Once they have failed, the
> > > system does not try to remember them in case there's a possibility of
> > > trying them again later. They're just lost.
> > I guess the solution would have to include a "retry in 1 second's
> > time" type failure mode, instead of just lost.

Maybe, in theory. In your case, I think a better solution would be to
eliminate the interference that causes the transient disconnects to
occur in the first place. USB isn't designed to operate reliably in an
environment filled with that much noise.

> > I.e. differentiate between the disk responding that the media failed,
> > and the link being down to the disk so the write message could not be
> > sent.
> > For example, NFS waits around for the network to return, maybe we
> > could add that functionality between a filesystem and usb storage.

In theory it could be done. I suspect the overall benefit would not be
very large; I have not heard lots of reports from other people facing
the problem you have. Consider that neither Windows nor Mac OS-X does
this.

Also, doing this would lead to other problems. For instance, I'm sure
some people want to know that a device has stopped working as soon as
the problem begins; they would get upset if the system kept trying to
reconnect for tens of seconds before finally deciding the device was
gone for good. (Consider the way people have complained a lot over the
years about NFS and its extremely long uninterruptible waits.)

> As a side note, I have seen USB links failing. Normally just to
> something like a keyboard or mouse, so it just comes back without the
> user knowing anything was wrong.

That's different. When the link to a USB mouse fails and then starts
working again, the system doesn't think the mouse has recovered; it
regards what happened as a new mouse being plugged in. (Same with
keyboards.) The user doesn't notice anything because the system treats
all mice the same. In fact, you can even plug in two mice at the same
time (that is, without bothering to wait for the first one to fail) and
the system will accept input from both of them interchangeably.

> The problem is USB links to disks don't recover currently.

Well, you have to admit that treating disks like mice -- considering all
of them to be the same -- would not be a good strategy. :-)

(On the other hand, sometimes two disks really do get treated as though
they are the same. That's what happens in a RAID-1 (mirroring) setup.
If you have mirrored USB disks, you can unplug one of them and the
system will continue working. And when you plug it back in later, the
system will repair it as necessary and then go on using it normally
without your noticing. But obviously this isn't what you have in mind.)

Alan Stern