On Sat, May 30, 2020 at 09:42:46AM +0900, Tetsuo Handa wrote:
> On 2020/05/30 5:41, Andrey Konovalov wrote:
> > On Thu, May 28, 2020 at 10:58 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> >>
> >> On Thu, May 28, 2020 at 09:51:35PM +0200, Andrey Konovalov wrote:
> >>> On Thu, May 28, 2020 at 9:40 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>> On Thu, May 28, 2020 at 09:03:43PM +0200, Andrey Konovalov wrote:
> >>>>
> >>>>> Ah, so the problem is that when a process exits, it tries to close wdm
> >>>>> fd first, which ends up calling wdm_flush(), which can't finish
> >>>>> because the USB requests are not terminated before raw-gadget fd is
> >>>>> closed, which is supposed to happen after wdm fd is closed. Is this
> >>>>> correct? I wonder what will happen if a real device stays connected
> >>>>> and ignores wdm requests.
> >>>>>
> >>>>> I don't understand though, how using wait_event_interruptible() will
> >>>>> shadow anything here.
> >>>>>
> >>>>> Alan, Greg, is this acceptable behavior for a USB driver?
> >>>>
> >>>> I don't understand what the problem is. Can you explain in more general
> >>>> terms -- nothing specific to wdm or anything like that -- what you are
> >>>> concerned about? Is this something that could happen to any gadget
> >>>> driver? Or any USB class device driver? Or does it only affect
> >>>> userspace components of raw-gadget drivers?
> >>>
> >>> So, AFAIU, we have a driver whose flush() callback blocks on
> >>> wait_event(), which can only terminate when either 1) the driver
> >>> receives a particular USB response from the device or 2) the device
> >>> disconnects.
> >>
> >> This sounds like a bug in the driver. What would it do if someone had a
> >> genuine (not emulated) but buggy USB device which didn't send the
> >> desired response? The only way to unblock the driver would be to unplug
> >> the device! That isn't acceptable behavior.
> >
> > OK, that's what I thought.
>
> I believe that this is not a bug in the driver but a problem of hardware
> failure. Unless this is high-availability code which is designed for safely
> failing over to another node, we don't need to care about hardware failure.

As Alan said, that's just not true.

It's the job of an operating system kernel to handle all of the crazy
ways hardware is broken, and make it work properly for people. We deal
with hardware failure all the time.

So don't do uninterruptible waits or loop forever waiting for some
hardware value to change that might not change. That's a sure way to
lock up the system and make users mad at you.

thanks,

greg k-h
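
The advice above (never wait without a bound on a condition that hardware
may never satisfy) can be illustrated with a small userspace sketch. This is
not the wdm driver's code and not kernel code; in the kernel, a bounded,
signal-aware wait would typically go through a helper such as
wait_event_interruptible_timeout(). The names wait_ready_timeout(),
ready_fn, never_ready() and ready_after_n() are hypothetical, made up for
this illustration only.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for "has the device responded yet?". */
typedef bool (*ready_fn)(void *ctx);

/* Current monotonic time in milliseconds. */
static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Poll ready() until it returns true or timeout_ms has elapsed.
 * Returns 0 if the condition became true, -ETIMEDOUT otherwise.
 * The caller always gets control back, even if the "device"
 * never answers.
 */
static int wait_ready_timeout(ready_fn ready, void *ctx, long timeout_ms)
{
	long long deadline = now_ms() + timeout_ms;
	struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1 ms */

	for (;;) {
		if (ready(ctx))
			return 0;
		if (now_ms() >= deadline)
			return -ETIMEDOUT;
		nanosleep(&pause, NULL);
	}
}

/* Simulates a broken device that never sends the expected response. */
static bool never_ready(void *ctx)
{
	(void)ctx;
	return false;
}

/* Simulates a device that responds once the counter in ctx reaches zero. */
static bool ready_after_n(void *ctx)
{
	int *remaining = ctx;

	return --(*remaining) <= 0;
}
```

With never_ready() the call returns -ETIMEDOUT once the deadline passes
instead of blocking until the device is unplugged, so the caller can report
an error and clean up. A real driver would also need to cancel its
outstanding requests on that path; that part is outside this sketch.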