On Wed, Aug 18, 2021 at 10:07 PM Ferry Toth <fntoth@xxxxxxxxx> wrote: > Op 18-08-2021 om 00:00 schreef Ferry Toth: > > Op 16-08-2021 om 07:18 schreef Felipe Balbi: > >> Ferry Toth <fntoth@xxxxxxxxx> writes: > >>>> Ferry Toth <fntoth@xxxxxxxxx> writes: > >>>>>>>> Ferry Toth <fntoth@xxxxxxxxx> writes: > >>>>>>>>>>>>> Hardware name: Intel Corporation Merrifield/BODEGA BAY, > >>>>>>>>>>>>> BIOS 542 > >>>>>>>>>>>>> 2015.01.21:18.19.48 > >>>>>>>>>>>>> RIP: 0010:0x500000000 > >>>>>>>>>>>>> Code: Unable to access opcode bytes at RIP 0x4ffffffd6. > >>>>>>>>>>>>> RSP: 0018:ffffa4d00045fc28 EFLAGS: 00010046 > >>>>>>>>>>>>> RAX: 0000000500000000 RBX: ffff8cd546aed200 RCX: > >>>>>>>>>>>>> 0000000000000000 > >>>>>>>>>>>>> RDX: 0000000000000000 RSI: ffff8cd547bfcae0 RDI: > >>>>>>>>>>>>> ffff8cd546aed200 > >>>>>>>>>>>>> RBP: ffff8cd547bfcae0 R08: 0000000000000000 R09: > >>>>>>>>>>>>> 0000000000000001 > >>>>>>>>>>>>> R10: ffff8cd541fd28c0 R11: 0000000000000000 R12: > >>>>>>>>>>>>> ffff8cd547342828 > >>>>>>>>>>>>> R13: ffff8cd546aed248 R14: 0000000000000000 R15: > >>>>>>>>>>>>> ffff8cd548b1d000 > >>>>>>>>>>>>> FS: 0000000000000000(0000) GS:ffff8cd57e200000(0000) > >>>>>>>>>>>>> knlGS:0000000000000000 > >>>>>>>>>>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >>>>>>>>>>>>> CR2: 0000000500000000 CR3: 000000000311e000 CR4: > >>>>>>>>>>>>> 00000000001006f0 > >>>>>>>>>>>>> Call Trace: > >>>>>>>>>>>>> ? dwc3_remove_requests.constprop.0+0x14d/0x170 > >>>>>>>>>>>>> ? __dwc3_gadget_ep_disable+0x7a/0x160 > >>>>>>>>>>>>> ? dwc3_gadget_ep_disable+0x3d/0xd0 > >>>>>>>>>>>>> ? usb_ep_disable+0x1c/0x > >>>>>>>>>>>>> ? u_audio_stop_capture+0x79/0x120 [u_audio] > >>>>>>>>>>>>> ? afunc_set_alt+0x73/0x80 [usb_f_uac2] > >>>>>> So this is triggered by a SetInterface request... > >>>>>> > >>>>>>>>>>>>> ? composite_setup+0x224/0x1b90 [libcomposite] > >>>>>>>>>>>>> ? __dwc3_gadget_kick_transfer+0x160/0x400 > >>>>>>>>>>>>> ? dwc3_gadget_ep_queue+0xf3/0x1a0 > >>>>>>>>>>>>> ? configfs_composite_setup+0x6b/0x90 [libcomposite] > >>>>>>>>>>>>> ? configfs_composite_setup+0x6b/0x90 [libcomposite] > >>>>>>>>>>>>> ? dwc3_ep0_interrupt+0x459/0xa40 > >>>>>>>>>>>>> ? dwc3_thread_interrupt+0x8ee/0xf40 > >>>>>>>>>>>>> ? __schedule+0x235/0x6c0 > >>>>>>>>>>>>> ? disable_irq_nosync+0x10/0x10 > >>>>>>>>>>>>> ? irq_thread_fn+0x1b/0x60 > >>>>>>>>>>>>> ? irq_thread+0xc0/0x160 > >>>>>>>>>>>>> ? irq_thread_check_affinity+0x70/0x70 > >>>>>>>>>>>>> ? irq_forced_thread_fn+0x70/0x70 > >>>>>>>>>>>>> ? kthread+0x122/0x140 > >>>>>>>>>>>>> ? set_kthread_struct+0x40/0x40 > >>>>>>>>>>>>> ? ret_from_fork+0x22/0x30 > >>>>>>>>>>>> Do you mind enabling dwc3 traces and collecting them? > >>>>>>>>>>>> Trying to figure > >>>>>>>>>>>> out how we got here. > >>>>>>>>>>>> > >>>>>>>>>>> I'll try if I can get the same error by booting with USB in > >>>>>>>>>>> host mode > >>>>>>>>>>> and then switch to device mode. If so I can enable traces > >>>>>>>>>>> and collect as > >>>>>>>>>>> you explained me before. > >>>>>>>>>>> > >>>>>>>>>>> I'll try before monday, as then I fly for a holiday and will > >>>>>>>>>>> not be > >>>>>>>>>>> available before rc5. > >>>>>>>>>> you can enable all of those with kernel cmdline :-) > >>>>>>>>>> > >>>>>>>>>> https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> you need ftrace_dump_on_oops=1 and also need the correct > >>>>>>>>>> options on > >>>>>>>>>> trace_buf_size and trace_event. > >>>>>>>>>> > >>>>>>>>> On Edison-Arduino I have a switch to go to device mode, after > >>>>>>>>> which > >>>>>>>>> udev triggers a script configure gadgets through configfs. > >>>>>>>>> > >>>>>>>>> I tried to log following these instructions: > >>>>>>>>> > >>>>>>>>> https://www.kernel.org/doc/html/latest/driver-api/usb/dwc3.html#reporting-bugs > >>>>>>>>> <https://www.kernel.org/doc/html/latest/driver-api/usb/dwc3.html#reporting-bugs> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Unfortunately the kernel crashes so badly I can not get to the > >>>>>>>>> ` cp > >>>>>>>>> /t/trace /root/trace.txt` line (after a while the watchdog > >>>>>>>>> kicks). > >>>>>>>>> > >>>>>>>>> What to do next? > >>>>>>>> Pass ftrace_dump_on_oops to kernel cmdline. > >>>>>>>> > >>>>>>> No sure if I did this right, on oops everything is pushed to > >>>>>>> console > >>>>>>> (115k2 serial), I hope nothing essential is lost. > >>>>>>> > >>>>>>> I copied the screen buffer to file see attached. > >>>>>> Thank you, I bet it took quite a some time :-) Anyway, looking at > >>>>>> the logs around Set Interface requests, we can track every endpoint > >>>>>> that's disabled. I'll take a guess and assume we're failing at > >>>>>> the last > >>>>>> Set Interface, that means we should have something odd with > >>>>>> ep6in, but > >>>>>> everything looks fine in the trace output: > >>>>>> > >>>>>> [ 75.823107] irq/14-d-596 0d... 42789194us : > >>>>>> dwc3_gadget_ep_enable: ep6in: mps 192/346 streams 16 burst 0 ring > >>>>>> 0/0 flags E:swbp:< > >>>>>> [ 75.835472] irq/14-d-596 0d... 42789198us : > >>>>>> dwc3_alloc_request: ep6in: req 0000000002c71409 length 0/0 zsI ==> 0 > >>>>>> [ 75.846416] irq/14-d-596 0d... 42789202us : > >>>>>> dwc3_ep_queue: ep6in: req 0000000002c71409 length 0/192 zsI ==> -115 > >>>>>> [ 75.857360] irq/14-d-596 0d... 42789204us : > >>>>>> dwc3_alloc_request: ep6in: req 00000000a324f5d0 length 0/0 zsI ==> 0 > >>>>>> [ 75.868301] irq/14-d-596 0d... 42789206us : > >>>>>> dwc3_ep_queue: ep6in: req 00000000a324f5d0 length 0/192 zsI ==> -115 > >>>>>> [ 75.879244] irq/14-d-596 0d... 42789209us : dwc3_event: > >>>>>> event (000020c2): ep0in: Transfer Not Ready [0] (Not Active) > >>>>>> [Status Phase] > >>>>>> [ 75.891880] irq/14-d-596 0d... 42789211us : > >>>>>> dwc3_prepare_trb: ep0in: trb 000000004c0ae319 (E0:D0) buf > >>>>>> 000000001bded000 size 0 ctrl 00000c33 (HLcs:SC:status2) > >>>>>> [ 75.989131] irq/14-d-596 0d... 42789224us : > >>>>>> dwc3_gadget_ep_cmd: ep0in: cmd 'Start Transfer' [406] params > >>>>>> 00000000 1bded000 00000000 --> status: Successful > >>>>>> [ 76.096261] irq/14-d-596 0d... 42789272us : dwc3_event: > >>>>>> event (0000c042): ep0in: Transfer Complete (sIL) [Status Phase] > >>>>>> [ 76.107834] irq/14-d-596 0d... 42789275us : > >>>>>> dwc3_complete_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf > >>>>>> 000000001bded000 size 0 ctrl 00000c32 (hLcs:SC:status2) > >>>>>> [ 76.122944] irq/14-d-596 0d... 42789277us : > >>>>>> dwc3_gadget_giveback: ep0out: req 00000000cb1bd3cd length 0/0 zsI > >>>>>> ==> 0 > >>>>>> [ 76.134160] irq/14-d-596 0d... 42789280us : > >>>>>> dwc3_prepare_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf > >>>>>> 000000001bded000 size 8 ctrl 00000c23 (HLcs:SC:setup) > >>>>>> [ 76.231322] irq/14-d-596 0d... 42789292us : > >>>>>> dwc3_gadget_ep_cmd: ep0out: cmd 'Start Transfer' [406] params > >>>>>> 00000000 1bded000 00000000 --> status: Successful > >>>>>> [ 76.297418] kworker/-23 0d... 42789670us : > >>>>>> dwc3_ep_queue: ep3in: req 0000000029586135 length 0/96 ZsI ==> -115 > >>>>>> [ 76.308278] kworker/-23 0d... 42789695us : > >>>>>> dwc3_prepare_trb: ep3in: trb 00000000b81213d6 (E1:D0) buf > >>>>>> 0000000003b7a800 size 96 ctrl 00000811 (Hlcs:sC:normal) > >>>>>> [ 76.395294] kworker/-23 0d... 42789707us : > >>>>>> dwc3_gadget_ep_cmd: ep3in: cmd 'Update Transfer' [60007] params > >>>>>> 00000000 00000000 00000000 --> status: Successful > >>>>>> [ 76.471900] irq/14-d-596 0d... 42789842us : dwc3_event: > >>>>>> event (0000c040): ep0out: Transfer Complete (sIL) [Setup Phase] > >>>>>> [ 76.489308] irq/14-d-596 0d... 42789845us : > >>>>>> dwc3_ctrl_req: Set Interface(Intf = 5, Alt.Setting = 0) > >>>>>> [ 76.505650] irq/14-d-596 0d... 42789851us : > >>>>>> dwc3_ep_dequeue: ep6in: req 0000000002c71409 length 0/192 zsI ==> > >>>>>> -115 > >>>>>> [ 76.523315] irq/14-d-596 0d... 42789854us : > >>>>>> dwc3_gadget_giveback: ep6in: req 0000000002c71409 length 0/192 > >>>>>> zsI ==> -104 > >>>>>> [ 76.541427] irq/14-d-596 0d... 42789857us : > >>>>>> dwc3_free_request: ep6in: req 0000000002c71409 length 0/192 zsI > >>>>>> ==> -104 > >>>>>> [ 76.559267] irq/14-d-596 0d... 42789859us : > >>>>>> dwc3_ep_dequeue: ep6in: req 00000000a324f5d0 length 0/192 zsI ==> > >>>>>> -115 > >>>>>> [ 76.576937] irq/14-d-596 0d... 42789861us : > >>>>>> dwc3_gadget_giveback: ep6in: req 00000000a324f5d0 length 0/192 > >>>>>> zsI ==> -104 > >>>>>> [ 76.595046] irq/14-d-596 0d... 42789862us : > >>>>>> dwc3_free_request: ep6in: req 00000000a324f5d0 length 0/192 zsI > >>>>>> ==> -104 > >>>>>> [ 76.612892] irq/14-d-596 0d... 42789865us : > >>>>>> dwc3_gadget_ep_disable: ep6in: mps 192/346 streams 16 burst 0 > >>>>>> ring 0/0 flags E:swbp:< > >>>>>> [ 76.665535] irq/14-d-596 0d... 42789873us : dwc3_event: > >>>>>> event (000020c2): ep0in: Transfer Not Ready [0] (Not Active) > >>>>>> [Status Phase] > >>>>>> [ 76.684716] irq/14-d-596 0d... 42789875us : > >>>>>> dwc3_prepare_trb: ep0in: trb 000000004c0ae319 (E0:D0) buf > >>>>>> 000000001bded000 size 0 ctrl 00000c33 (HLcs:SC:status2) > >>>>>> [ 76.819195] irq/14-d-596 0d... 42789886us : > >>>>>> dwc3_gadget_ep_cmd: ep0in: cmd 'Start Transfer' [406] params > >>>>>> 00000000 1bded000 00000000 --> status: Successful > >>>>>> [ 76.926324] irq/14-d-596 0d... 42789930us : dwc3_event: > >>>>>> event (0000c042): ep0in: Transfer Complete (sIL) [Status Phase] > >>>>>> [ 76.937892] irq/14-d-596 0d... 42789933us : > >>>>>> dwc3_complete_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf > >>>>>> 000000001bded000 size 0 ctrl 00000c32 (hLcs:SC:status2) > >>>>>> [ 76.953003] irq/14-d-596 0d... 42789935us : > >>>>>> dwc3_gadget_giveback: ep0out: req 00000000cb1bd3cd length 0/0 zsI > >>>>>> ==> 0 > >>>>>> [ 76.964217] irq/14-d-596 0d... 42789938us : > >>>>>> dwc3_prepare_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf > >>>>>> 000000001bded000 size 8 ctrl 00000c23 (HLcs:SC:setup) > >>>>>> [ 77.061379] irq/14-d-596 0d... 42789950us : > >>>>>> dwc3_gadget_ep_cmd: ep0out: cmd 'Start Transfer' [406] params > >>>>>> 00000000 1bded000 00000000 --> status: Successful > >>>>>> [ 77.168595] irq/14-d-596 0d... 42790509us : dwc3_event: > >>>>>> event (0000c040): ep0out: Transfer Complete (sIL) [Setup Phase] > >>>>>> [ 77.180159] irq/14-d-596 0d... 42790512us : > >>>>>> dwc3_ctrl_req: Get String Descriptor(Index = 18, Length = 255) > >>>>>> [ 77.190578] irq/14-d-596 0d... 42790537us : > >>>>>> dwc3_prepare_trb: ep0in: trb 000000004c0ae319 (E0:D0) buf > >>>>>> 0000000003b68000 size 36 ctrl 00000c53 (HLcs:SC:data) > >>>>>> [ 77.287648] irq/14-d-596 0d... 42790550us : > >>>>>> dwc3_gadget_ep_cmd: ep0in: cmd 'Start Transfer' [406] params > >>>>>> 00000000 1bded000 00000000 --> status: Successful > >>>>>> [ 77.333107] irq/14-d-596 0d... 42790557us : dwc3_event: > >>>>>> event (000010c2): ep0in: Transfer Not Ready [0] (Not Active) > >>>>>> [Data Phase] > >>>>>> [ 77.407223] irq/14-d-596 0d... 42790575us : dwc3_event: > >>>>>> event (000090c2): ep0in: Transfer Not Ready [0] (Active) [Data > >>>>>> Phase] > >>>>>> [ 77.480985] irq/14-d-596 0d... 42790588us : dwc3_event: > >>>>>> event (0000c042): ep0in: Transfer Complete (sIL) [Data Phase] > >>>>>> [ 77.492376] irq/14-d-596 0d... 42790590us : > >>>>>> dwc3_complete_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf > >>>>>> 0000000003b68000 size 0 ctrl 00000c52 (hLcs:SC:data) > >>>>>> [ 77.507221] irq/14-d-596 0d... 42790595us : > >>>>>> dwc3_gadget_giveback: ep0out: req 00000000cb1bd3cd length 36/36 > >>>>>> ZsI ==> 0 > >>>>>> [ 77.518609] irq/14-d-596 0d... 42790597us : dwc3_event: > >>>>>> event (000020c0): ep0out: Transfer Not Ready [0] (Not Active) > >>>>>> [Status Phase] > >>>>>> [ 77.531332] irq/14-d-596 0d... 42790598us : > >>>>>> dwc3_prepare_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf > >>>>>> 000000001bded000 size 0 ctrl 00000c43 (HLcs:SC:status3) > >>>>>> [ 77.628669] irq/14-d-596 0d... 42790609us : > >>>>>> dwc3_gadget_ep_cmd: ep0out: cmd 'Start Transfer' [406] params > >>>>>> 00000000 1bded000 00000000 --> status: Successful > >>>>>> > >>>>>> Do you mind adding a few prints in dwc3_remove_requests to tell > >>>>>> us which > >>>>>> endpoint is being processed? Then we'll know for sure which one > >>>>>> caused > >>>>>> the crash. > >>>>>> > >>>>> I wouldn't mind but am leaving on a holiday, won't have time until > >>>>> 6 aug. > >>>> not a problem, we'll still be here when you're back :-) > >>> Well, let's go then :-) > >>> > >>> To get back in the mood I have retested 5.13.0, 5.14.0-rc1, 5.14.0-rc2 > >>> and 5.14.0-rc5. > >>> > >>> I find that 5.13.0 works fine, and the issue starts from 5.14.0-rc1. > >> That's great finding. We have a bisection point. There are a total of > >> 13764 commits between v5.13 and v5.14-rc1 > >> > >> $ git rev-list --count v5.13..v5.14-rc1 > >> 13764 > >> > >> git bisect should find the offending commit in at most 14 tries. That's > >> not too bad. > > I correctly guesstimated that the problem got introduced by the usb > > merge 79160a60 > > > > "Merge tag 'usb-5.14-rc1' of > > git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb" > > > > 116 commits(7 bisects). > > > > 24f779dac8f3efb9629adc0e486914d93dc45517 is the first bad commit > > > > "usb: gadget: f_uac2/u_audio: add feedback endpoint support" > > > > Ruslan's 3 patches are related to each other so I reverted all three > > 24f779da...e89bb428 and applied the reverts to rc1. > > > > I can confirm this indeed resolves the problem in rc1. > > > > Is late now, tomorrow evening I will apply the reverts to rc6. > > With these reverts rc6 works fine as well. > > So, where do we go from here? Since it's rc6 to rc7 I think the best is to send a revert series to handle the regression. > >>> With 5.14.0-rc5 the problem seems worse (or different?), and just > >>> disabling uac2 gadget does not prevent the crash. Even disabling gser > >>> and mass_storage.usb0 as well there is still a crash. > >>> > >>> Now I'm not sure how to proceed. Bisect rc1? Or focus on rc5 (rc6?)? > >> I'd first bisect between 5.13 and v5.14-rc1. Once you find the offending > >> commit, verify if reverting that on -rc1 works, then verify if reverting > >> on -rc5 also works :-) -- With Best Regards, Andy Shevchenko