On Mon, 27 Nov 2017 11:17:54 +0200, Felipe Balbi <balbi@xxxxxxxxxx> wrote: > as I said, the *only* thing that schedules from inside > dwc3_gadget_ep_dequeue() is wait_event_lock_irq(). Which works fine > unless usb_ep_dequeue() is called with locks held and IRQs disabled. This I understand, yes. What puzzles me is how to get it to be called with IRQs enabled. At least, if ep_lock can be left unaquired in ffs_aio_cancel without risking data inconsistencies when called without larger locks acquired (this I am still not certain of yet), there are still 3 more locks to go before dwc3_gadget_ep_dequeue could get called with IRQs enabled in such call stack: >>> [ 382.515903] #0: (rcu_callback){....}, at: [<c10b4ff0>] rcu_process_callbacks+0x260/0x440 Acquired with lock_acquire (via rcu_lock_acquire), which disables IRQs. >>> [ 382.524572] #1: (rcu_read_lock_sched){....}, at: [<c1358ba0>] percpu_ref_switch_to_atomic_rcu+0xb0/0x130 Ditto. >>> [ 382.534650] #2: (&(&ctx->ctx_lock)->rlock){....}, at: [<c11f0c73>] free_ioctx_users+0x23/0xd0 free_ioctx_users acquires ctx->ctx_lock using spin_lock_irq, so again disabling IRQs. > If my original suggestion didn't help, then there may be other bugs in > f_fs.c. So I believe something has to delay cancelling the AIO so it happens in another call stack, with IRQs enabled. I did the following, which is likely very naive. I believe it opens a race-condition on io_data->work (complete vs. cancel), and as I still do not understand what eps_lock is supposed to protect I may be opening something else there. I do not like that I cannot handle nor propagate any error returned by usb_ep_dequeue. But on the plus side, it did allow my program to get killed without causing an immediate panic: [ 73.396732] read descriptors [ 73.403524] read strings [ 92.407515] configfs-gadget gadget: high-speed config #1: c [ 104.090661] ffs_aio_cancel_worker: 0 io_submit(0xb79ed000, 2, [{aio_lio_opcode=IOCB_CMD_PREADV, aio_fildes=7, aio_buf=[{iov_base=0xb78ec008, iov_len=1048576}], aio_offset=0, aio_resfd=3}, {aio_lio_opcode=IOCB_CMD_PREADV, aio_fildes=7, aio_buf=[{iov_base=0xb77eb008, iov_len=1048576}], aio_offset=0, aio_resfd=3}]) = 2 epoll_wait(8, 0x12c4558, 1023, -1) = -1 EINTR (Interrupted system call) --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=1887, si_uid=1000} --- +++ killed by SIGTERM +++ Here is the naive patch (NOT for inclusion, but as an RFC): --- a/drivers/usb/gadget/function/f_fs.c +++ b/drivers/usb/gadget/function/f_fs.c @@ -1069,23 +1069,33 @@ ffs_epfile_open(struct inode *inode, struct file *file) return 0; } +static void ffs_aio_cancel_worker(struct work_struct *work) +{ + struct ffs_io_data *io_data = container_of(work, struct ffs_io_data, + work); + int value; + + ENTER(); + + value = usb_ep_dequeue(io_data->ep, io_data->req); + printk("ffs_aio_cancel_worker: %i", value); +} + static int ffs_aio_cancel(struct kiocb *kiocb) { struct ffs_io_data *io_data = kiocb->private; - struct ffs_epfile *epfile = kiocb->ki_filp->private_data; + struct ffs_data *ffs = io_data->ffs; int value; ENTER(); - spin_lock_irq(&epfile->ffs->eps_lock); - - if (likely(io_data && io_data->ep && io_data->req)) - value = usb_ep_dequeue(io_data->ep, io_data->req); - else + if (likely(io_data && io_data->ep && io_data->req)) { + INIT_WORK(&io_data->work, ffs_aio_cancel_worker); + queue_work(ffs->io_completion_wq, &io_data->work); + value = -EINPROGRESS; + } else value = -EINVAL; - spin_unlock_irq(&epfile->ffs->eps_lock); - return value; } Regards, -- Vincent Pelletier -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html