On Tue, Aug 23, 2016 at 02:36:30AM +0200, Clemens Gruber wrote:
> Hi,
>
> I am using an i.MX6Q embedded board, acting as an (Ethernet) gadget with
> RNDIS function, connected over a USB OTG cable to a PC.
> Most of the time it works fine, but in some mysterious circumstances,
> a kernel panic occurs, just after attaching the OTG cable, connecting it
> to the other machine:
>
> [ 54.012989] Unable to handle kernel NULL pointer dereference at virtual address 00000020
> [ 54.021099] pgd = 80004000
> [ 54.023816] [00000020] *pgd=00000000
> [ 54.027422] Internal error: Oops: 817 [#1] PREEMPT SMP ARM
> [ 54.032915] Modules linked in:
> [ 54.035998] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc3-00017-g336bc4a #315
> [ 54.043662] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> [ 54.050196] task: 80b05f80 task.stack: 80b00000
> [ 54.054744] PC is at isr_setup_status_phase+0x1c/0x40
> [ 54.059805] LR is at 0xbe570890
> [ 54.062957] pc : [<804ac464>] lr : [<be570890>] psr: 200e0193
> [ 54.062957] sp : 80b01e10 ip : be570570 fp : be570890
> [ 54.074442] r10: be5eeebc r9 : be570010 r8 : be5eeebc
> [ 54.079673] r7 : be5708d0 r6 : be5eee80 r5 : be7fcf40 r4 : 00000001
> [ 54.086206] r3 : be571010 r2 : 804ab368 r1 : 00000000 r0 : be570010
> [ 54.092742] Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
> [ 54.099972] Control: 10c5387d Table: 4e34404a DAC: 00000051
> [ 54.105723] Process swapper/0 (pid: 0, stack limit = 0x80b00210)
> (snip)
> [ 54.247100] [<804ac464>] (isr_setup_status_phase) from [<804acbbc>] (isr_tr_complete_handler+0x734/0x98c)
> [ 54.256680] [<804acbbc>] (isr_tr_complete_handler) from [<804acfc0>] (udc_irq+0x1ac/0x318)
> [ 54.264964] [<804acfc0>] (udc_irq) from [<8018ba28>] (__handle_irq_event_percpu+0x9c/0x128)
> [ 54.273330] [<8018ba28>] (__handle_irq_event_percpu) from [<8018bae0>] (handle_irq_event_percpu+0x2c/0x7c)
> [ 54.282995] [<8018bae0>] (handle_irq_event_percpu) from [<8018bb68>] (handle_irq_event+0x38/0x5c)
> [ 54.291880] [<8018bb68>] (handle_irq_event) from [<8018f2cc>] (handle_fasteoi_irq+0xd0/0x1bc)
> [ 54.300418] [<8018f2cc>] (handle_fasteoi_irq) from [<8018afb0>] (generic_handle_irq+0x24/0x34)
> [ 54.309042] [<8018afb0>] (generic_handle_irq) from [<8018b2dc>] (__handle_domain_irq+0x7c/0xec)
> [ 54.317754] [<8018b2dc>] (__handle_domain_irq) from [<80101524>] (gic_handle_irq+0x38/0x74)
> [ 54.326119] [<80101524>] (gic_handle_irq) from [<8010ccb0>] (__irq_svc+0x70/0xb0)
> (snip)
>
> After looking through the isr_setup_status_phase disassembly, I found
> that ci->status must have been NULL and dereferencing it in
> ci->status->context = ci; triggered the panic.
>
> The interrupt was a USBINT (UI bit was set) and isr_tr_complete_handler
> was called from udc_irq.
> In the IMX6DQRM I read about the UI bit: "This bit is also set by the
> Host/Device Controller when a short packet is detected." and about
> USBERRINT / UEI bit: "This bit is set along with the USBINT bit, if the
> TD on which the error interrupt occurred also had its interrupt on
> complete (IOC) bit set." (page 5494)
>
> However, we do not check for UEI in udc_irq.
> Could this be the cause of this error?

UEI is an error interrupt, and the software does not handle it, so it
will not affect ci->status.

> Should we only call isr_tr_complete_handler if UI && !UEI ?
>
> Or would adding a check for ci->status == NULL in isr_setup_status_phase
> and returning an error code also be a good idea?

I agree with that.

>
> Do you have an idea what's going on there and why ci->status is NULL?
>

I don't fully understand it. The only possibility I can see is that the
last disconnect event (see ci_udc_vbus_session -> _gadget_stop_activity)
was scheduled very late because VBUS drops very slowly.

-- 

Best Regards,
Peter Chen
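
For reference, the guard discussed above (checking ci->status before dereferencing it and returning an error code) can be modelled in user space roughly as follows. This is only a sketch under stated assumptions: struct model_ci, struct model_request and the model_* functions are hypothetical stand-ins, not the chipidea driver's definitions, and -EINVAL is an assumed choice of error code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a USB request; not the kernel's struct usb_request. */
struct model_request {
	void *context;
	void (*complete)(struct model_request *req);
};

/* Hypothetical stand-in for the controller state; not the kernel's struct ci_hdrc. */
struct model_ci {
	struct model_request *status; /* may be NULL, e.g. after gadget teardown */
};

/* Placeholder completion callback for the status request. */
void model_status_complete(struct model_request *req)
{
	(void)req;
}

/*
 * Prepare the status-phase request.
 * Returns 0 on success, or -EINVAL (assumed error code) when the
 * status request is missing, instead of dereferencing a NULL pointer.
 */
int model_setup_status_phase(struct model_ci *ci)
{
	if (ci->status == NULL)
		return -EINVAL;

	ci->status->context = ci;
	ci->status->complete = model_status_complete;
	return 0;
}
```

With such a guard the interrupt path gets an error back instead of oopsing, but it does not explain why ci->status was NULL in the first place.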