On 2020/05/30 10:10, Alan Stern wrote:
> On Sat, May 30, 2020 at 09:42:46AM +0900, Tetsuo Handa wrote:
>> On 2020/05/30 5:41, Andrey Konovalov wrote:
>>> On Thu, May 28, 2020 at 10:58 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>
>>>> This sounds like a bug in the driver. What would it do if someone had a
>>>> genuine (not emulated) but buggy USB device which didn't send the
>>>> desired response? The only way to unblock the driver would be to unplug
>>>> the device! That isn't acceptable behavior.
>>>
>>> OK, that's what I thought.
>>
>> I believe that this is not a bug in the driver but a problem of hardware
>> failure. Unless this is high-availability code which is designed for safely
>> failing over to other node, we don't need to care about hardware failure.
>
> Oh my! I can't even imagine what Linus would say if he saw that... :-(
>
> Have you heard of Bad USB?

Of course, I've heard of that. Please show me as a patch first.

>
> The kernel most definitely does need to protect itself against
> misbehaving hardware. Let's just leave it at that. If you don't
> believe me, ask Greg KH.

I've made many locations killable (in order to reduce damage caused by OOM
condition). But I can't make locations killable where handling the SIGKILL
case is too difficult to implement.

"struct file_operations"->flush() is called from filp_close() when there is
something which has to be done before "struct file_operations"->release()
is called. As far as I read this thread, what you are trying to do sounds
like allowing "not waiting for completion of wdm_out_callback()" with only
's/wait_event/wait_event_interruptible/' in wdm_flush(). Then, please do
remove the wdm_flush() call itself.

I'm not familiar with USB. But at least we would need to do something
similar to commit d0bd587a80960d7b ("usermodehelper: implement
UMH_KILLABLE") in addition to 's/wait_event/wait_event_interruptible/' in
wdm_flush().

>
> I admit, causing a driver to hang isn't the worst thing a buggy device
> can do. But the kernel is supposed to be able to cope with such things
> gracefully.

My understanding is that the "misbehaving hardware" in this bug report is
not "USB device itself" but "CPU used for receiving request from that USB
device and sending response to that USB device". I don't know how
wdm_flush() can recover when the CPU which is supposed to unblock
wait_event() is blocked inside that wait_event() itself. Unless you can
safely omit wdm_flush() by doing something similar to commit
d0bd587a80960d7b, this looks to me like a circular dependency which is
impossible to solve. Therefore, again, please show me as a patch first.
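
To illustrate why swapping the wait primitive in wdm_flush() is not enough
on its own, here is a minimal userspace sketch. It is only an analogy built
on pthreads, not the driver code and not the content of the commit mentioned
above. All names in it (struct flush_state, flush_complete(), flush_kill(),
flush_wait_killable()) are hypothetical, and -EINTR stands in for whatever
error a killable kernel wait would return. The point it tries to show is
that an early return leaves the write outstanding, so the caller needs some
additional handling for that case.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state that a flush handler waits on. */
struct flush_state {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool write_done;	/* set by the modelled completion callback */
	bool killed;		/* models a pending fatal signal */
};

#define FLUSH_STATE_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false, false }

/* Models the completion callback finishing the outstanding write. */
void flush_complete(struct flush_state *s)
{
	pthread_mutex_lock(&s->lock);
	s->write_done = true;
	pthread_cond_broadcast(&s->cond);
	pthread_mutex_unlock(&s->lock);
}

/* Models delivery of a fatal signal to the waiting task. */
void flush_kill(struct flush_state *s)
{
	pthread_mutex_lock(&s->lock);
	s->killed = true;
	pthread_cond_broadcast(&s->cond);
	pthread_mutex_unlock(&s->lock);
}

/*
 * Killable wait: returns 0 once the write has completed, or -EINTR if the
 * waiter was killed first. On -EINTR the write is still outstanding, so
 * the caller must not tear down anything the completion side still uses.
 */
int flush_wait_killable(struct flush_state *s)
{
	int ret;

	assert(s != NULL);
	pthread_mutex_lock(&s->lock);
	while (!s->write_done && !s->killed)
		pthread_cond_wait(&s->cond, &s->lock);
	ret = s->write_done ? 0 : -EINTR;
	pthread_mutex_unlock(&s->lock);
	return ret;
}
```

In this sketch the -EINTR path is exactly the "not waiting for completion"
case: flush_wait_killable() returns, but nothing has made it safe to proceed
to the release step, which is the part that would need extra work.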