On Mon, Jun 8, 2015 at 6:14 PM, John Stultz <john.stultz@xxxxxxxxxx> wrote: > After setting up functionfs for adb w/ 4.1-rc7, I noticed some flakey behavior. > I enabled some lock debugging and got the following: > > [ 91.648093] read strings > [ 91.650264] g_ffs gadget: g_ffs ready > [ 91.652551] ci_hdrc ci_hdrc.0: CI_HDRC_CONTROLLER_RESET_EVENT received > [ 96.068693] BUG: spinlock lockup suspected on CPU#0, adbd/2791 > [ 96.068751] lock: 0xe7764880, .magic: e7764880, .owner: <none>/-1, > .owner_cpu: -407539900 > [ 96.073448] CPU: 0 PID: 2791 Comm: adbd Not tainted > 4.1.0-rc1-00032-g359b12f #147 > [ 96.081688] Hardware name: Qualcomm (Flattened Device Tree) > [ 96.089266] [<c0216ac8>] (unwind_backtrace) from [<c02136a8>] > (show_stack+0x10/0x14) > [ 96.094635] [<c02136a8>] (show_stack) from [<c075d9fc>] > (dump_stack+0x70/0xbc) > [ 96.102627] [<c075d9fc>] (dump_stack) from [<c026ef90>] > (do_raw_spin_lock+0x114/0x1a0) > [ 96.109661] [<c026ef90>] (do_raw_spin_lock) from [<c0764cb8>] > (_raw_spin_lock_irqsave+0x50/0x5c) > [ 96.117560] [<c0764cb8>] (_raw_spin_lock_irqsave) from [<c037c1a0>] > (kiocb_set_cancel_fn+0x1c/0x60) > [ 96.126519] [<c037c1a0>] (kiocb_set_cancel_fn) from [<c05ae568>] > (ffs_epfile_read_iter+0x8c/0x140) > [ 96.135289] [<c05ae568>] (ffs_epfile_read_iter) from [<c0332018>] > (__vfs_read+0xb0/0xd4) > [ 96.144290] [<c0332018>] (__vfs_read) from [<c0332ef8>] (vfs_read+0x7c/0x100) > [ 96.152535] [<c0332ef8>] (vfs_read) from [<c0332fbc>] (SyS_read+0x40/0x8c) > [ 96.159571] [<c0332fbc>] (SyS_read) from [<c020ff20>] > (ret_fast_syscall+0x0/0x4c) > [ 117.678633] INFO: rcu_preempt detected stalls on CPUs/tasks: > [ 117.683069] 0: (1 GPs behind) idle=805/140000000000000/0 > softirq=7187/7189 fqs=2601 > [ 117.683316] (detected by 3, t=2603 jiffies, g=3028, c=3027, q=474) > [ 117.692345] Task dump for CPU 0: > [ 117.697202] adbd R running 0 2791 1 0x00000002 > [ 117.704296] [<c075f234>] (__schedule) from [<ffffffff>] (0xffffffff) > > > It seems we're stuck on the kioctx.ctx_lock, and reviewing that lock > usage I don't see any problems in fs/aio.c right off. > > So I'm guessing the f_fs.c code is somehow not initializing the lock > structure, or maybe calling kiocb_set_cancel_fn() from the wrong > context? So looking further at this, it seems that the __vfs_read() calls new_sync_read(), which allocates a struct kiocb kiocb on the stack and passes it to the ffs_epfile_read_iter() funciton. That then calls kiocb_set_cancel_fn() passing a pointer to that kiocb. However, kiocb_set_cancel_fn assumes the kiocb is a sub-element of a struct aio_kiocb, and it tries to grab the kioctx from that parent structure. However it seems there is no aio_kiocb structure here, so the spin_lock_irqsave hangs trying to lock random data on the stack. I'm not super sure what the right fix is, but if do something like the following (sorry, whitespace corrupted via copy/paste), I don't seem to run into the problem. diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c index 3507f88..e322818 100644 --- a/drivers/usb/gadget/function/f_fs.c +++ b/drivers/usb/gadget/function/f_fs.c @@ -924,7 +924,8 @@ static ssize_t ffs_epfile_write_iter(struct kiocb *kiocb, struct iov_iter *from) kiocb->private = p; - kiocb_set_cancel_fn(kiocb, ffs_aio_cancel); + if (p->aio) + kiocb_set_cancel_fn(kiocb, ffs_aio_cancel); res = ffs_epfile_io(kiocb->ki_filp, p); if (res == -EIOCBQUEUED) @@ -968,7 +969,8 @@ static ssize_t ffs_epfile_read_iter(struct kiocb *kiocb, struct iov_iter *to) kiocb->private = p; - kiocb_set_cancel_fn(kiocb, ffs_aio_cancel); + if(p->aio) + kiocb_set_cancel_fn(kiocb, ffs_aio_cancel); res = ffs_epfile_io(kiocb->ki_filp, p); if (res == -EIOCBQUEUED) Is there a better solution here? I'm not sure I see if the is_sync_kiocb(kiocb) check would ever be false from here, so I'm not sure if all the p->aio checking is really needed or not. This issue seems to have been introduced with 70e60d917e91fff (gadget/function/f_fs.c: switch to ->{read,write}_iter()) thanks -john -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html