On Thu, May 28, 2020 at 6:03 PM Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote: > > On 2020/05/29 0:18, Andrey Konovalov wrote: > >> I might have found what is wrong. > >> > >> My understanding is that a process using /dev/raw-gadget is responsible for > >> reacting to every USB request. I don't know whether /dev/raw-gadget already > >> provides callback for aborting the in-flight USB requests (in order to resume > >> wdm_flush()) when /dev/raw-gadget is closed (due to explicit close() syscall or > >> implicit exit_files() from do_exit() upon SIGKILL). I assume /dev/raw-gadget > >> already provides such callback in the following paragraphs. > > > > raw-gadget should kill all unfishished USB requests when the file is closed. > > I see. But > > > > >> > >> Since the reproducer is opening both /dev/raw-gadget (which is char-10-62) and > >> /dev/cdc-wdm0 (which is char-180-0), it seems that the kernel is falling into > >> deadlock condition due to the need to close both files when the reproducer is > >> killed. My guess is that since that process is stuck at wdm_flush() (due to > >> explicit close() syscall or implicit exit_files() from do_exit() upon SIGKILL), > >> that process cannot react to USB requests which are needed for resuming wdm_flush(). > >> Unexpectedly blocking a process which is responsible for reacting to USB requests > >> will look as if it is a broken hardware. > > > > Hm, so wdm_flush() is unable to finish unless an expected USB request > > is received from the device? This is a bug in the wdm driver then. > > this specific bug report is caused by being unable to close /dev/cdc-wdm0 > due to /dev/raw-gadget API usage bug in the userspace process. In other words, > this bug report should be closed with "#syz invalid" like a bug report at > https://syzkaller.appspot.com/bug?id=287aa8708bc940d0ca1645223c53dd4c2d203be6 > which unexpectedly did ioctl(FIFREEZE) without corresponding ioctl(FITHAW). > > > Should we use wait_event_interruptible() instead of wait_event() in > > wdm_flush()? > > That only shadows this kind of bug reports, by not using TASK_UNINTERRUPTIBLE. > > The problem that the userspace process which is responsible for closing > /dev/raw-gadget gets stuck at wdm_flush() unless interrupted by a signal > when closing /dev/cdc-wdm0 is remaining. I think that a process should not > open /dev/raw-gadget and /dev/cdc-wdm0 at the same time. Ah, so the problem is that when a process exits, it tries to close wdm fd first, which ends up calling wdm_flush(), which can't finish because the USB requests are not terminated before raw-gadget fd is closed, which is supposed to happen after wdm fd is closed. Is this correct? I wonder what will happen if a real device stays connected and ignores wdm requests. I don't understand though, how using wait_event_interruptible() will shadow anything here. Alan, Greg, is this acceptable behavior for a USB driver?