On 15 Oct 2015, at 04:12, Kaukab, Yousaf <yousaf.kaukab@xxxxxxxxx> wrote:

>> -----Original Message-----
>> From: Paul Jones [mailto:p.jones@xxxxxxxxxx]
>> Sent: Thursday, October 15, 2015 12:30 AM
>> To: Alan Stern
>> Cc: Kaukab, Yousaf; Felipe Balbi; Linux USB Mailing List
>> Subject: Re: Crash in usb_f_mass_storage
>>
>>
>> On 14 Oct 2015, at 15:37, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>>
>>> On Wed, 14 Oct 2015, Paul Jones wrote:
>>>
>>>> On 12 Oct 2015, at 14:16, Felipe Balbi <balbi@xxxxxx> wrote:
>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> Paul Jones <p.jones@xxxxxxxxxx> writes:
>>>>>> On 10 Oct 2015, at 16:32, Paul Jones <p.jones@xxxxxxxxxx> wrote:
>>>>>>
>>>>>>> I came across the following kernel message on the latest 4.3-rc4 whilst performance testing on a USB3380 device connected to a Mac (10.9.5):
>>>>>>>
>>>>>>> [ 51.613838] WARNING: CPU: 2 PID: 0 at drivers/usb/gadget/function/f_mass_storage.c:346 fsg_setup+0x12a/0x170 [usb_f_mass_storage]()
>>>>>>> [ 51.613838] Modules linked in: usb_f_mass_storage libcomposite configfs drbg ansi_cprng ctr ccm arc4 snd_hda_codec_hdmi iwlmvm i915 mac80211 snd_hda_codec_realtek snd_hda_codec_generic hid_generic intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp iwlwifi kvm_intel cfg80211 kvm drm_kms_helper drm snd_hda_intel snd_hda_codec btusb btrtl crct10dif_pclmul crc32_pclmul btbcm snd_hda_core ghash_clmulni_intel btintel bluetooth snd_hwdep e1000e aesni_intel aes_x86_64 lrw snd_pcm gf128mul glue_helper ablk_helper cryptd serio_raw alx mei_me lpc_ich usbhid mei snd_timer snd net2280 i2c_algo_bit ptp udc_core fb_sys_fops mdio syscopyarea pps_core sysfillrect soundcore sysimgblt i2c_hid hid video dw_dmac sdhci_acpi shpchp i2c_designware_platform dw_dmac_core spi_pxa2xx_platform sdhci 8250_dw i2c_designware_core acpi_pad lp mac_hid parport
>>>>>>> [ 51.613858] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G W 4.3.0-rc4+ #4
>>>>>>> [ 51.613859] Hardware name: Gigabyte Technology Co., Ltd. H97N-WIFI/H97N-WIFI, BIOS F7 04/21/2015
>>>>>>> [ 51.613860] ffffffffa03e9e10 ffff88021eb03d70 ffffffff81393f5d 0000000000000000
>>>>>>> [ 51.613861] ffff88021eb03da8 ffffffff81075856 ffff880037be4400 ffff88020b3023c8
>>>>>>> [ 51.613862] ffff880037be4400 00000000ffffffa1 0000000000000000 ffff88021eb03db8
>>>>>>> [ 51.613863] Call Trace:
>>>>>>> [ 51.613864] <IRQ> [<ffffffff81393f5d>] dump_stack+0x44/0x57
>>>>>>> [ 51.613867] [<ffffffff81075856>] warn_slowpath_common+0x86/0xc0
>>>>>>> [ 51.613868] [<ffffffff8107594a>] warn_slowpath_null+0x1a/0x20
>>>>>>> [ 51.613870] [<ffffffffa03e4c2a>] fsg_setup+0x12a/0x170 [usb_f_mass_storage]
>>>>>>> [ 51.613872] [<ffffffffa036ebd3>] composite_setup+0x173/0x16b0 [libcomposite]
>>>>>>> [ 51.613873] [<ffffffff810e41da>] ? ktime_get+0x3a/0x90
>>>>>>> [ 51.613874] [<ffffffff8104dac9>] ? lapic_next_deadline+0x29/0x30
>>>>>>> [ 51.613875] [<ffffffff810e41da>] ? ktime_get+0x3a/0x90
>>>>>>> [ 51.613877] [<ffffffffa00e8452>] net2280_irq+0xba2/0x10db [net2280]
>>>>>>> [ 51.613879] [<ffffffff810cb8b9>] handle_irq_event_percpu+0x39/0x180
>>>>>>> [ 51.613880] [<ffffffff810cba45>] handle_irq_event+0x45/0x70
>>>>>>> [ 51.613881] [<ffffffff810ceab9>] handle_edge_irq+0x99/0x150
>>>>>>> [ 51.613883] [<ffffffff810190cd>] handle_irq+0x1d/0x30
>>>>>>> [ 51.613883] [<ffffffff8177ccbd>] do_IRQ+0x4d/0xd0
>>>>>>> [ 51.613885] [<ffffffff8177a947>] common_interrupt+0x87/0x87
>>>>>>> [ 51.613885] <EOI> [<ffffffff81631638>] ? cpuidle_enter_state+0xb8/0x220
>>>>>>> [ 51.613888] [<ffffffff816317d7>] cpuidle_enter+0x17/0x20
>>>>>>> [ 51.613889] [<ffffffff810b5b52>] call_cpuidle+0x32/0x60
>>>>>>> [ 51.613890] [<ffffffff816317b3>] ? cpuidle_select+0x13/0x20
>>>>>>> [ 51.613891] [<ffffffff810b5d9c>] cpu_startup_entry+0x21c/0x2e0
>>>>>>> [ 51.613891] [<ffffffff8104c4d4>] start_secondary+0x104/0x130
>>>>>>> [ 51.613892] ---[ end trace bda1c37ade46c57d ]
>>>>>>>
>>>>>>> I can also reliably reproduce this by connecting the USB3380 to a USB port on the same Linux machine.
>>>>>>> In that case I also see an error:
>>>>>>> net2280 <pci-id>: net2280_enable: error=-22
>
> net2280_enable is returning EINVAL in more than one place. Can you check which one it is?
> We need better error reporting from this driver.
>
>>>>>>>
>>>>>>> Perhaps unrelated, but there is also a message:
>>>>>>> configfs-gadget gadget: common->fsg is NULL in fsg_setup at 511
>>>>>> The same crash happens in 4.2 as well but not in 4.1.
>>>>>
>>>>> care to run a git bisect and find offending commit ?
>>>> Unfortunately I encountered many kernels that hung my machine during a git bisect, so I had to git bisect skip many a time.
>>>> Here's the git bisect result (between v4.1 and v4.2):
>>>> There are only 'skip'ped commits left to test.
>>>> The first bad commit could be any of:
>>>> 4117a60c8e4c8d5f9fc05578e359d09d0fdf9d07
>>>> 4ae82e5d23961515796d76850499db3866c5e73b
>>>> We cannot bisect more!
>>>>
>>>> Oddly neither of those commits seem very relevant to the problem. Not
>>>> sure if this helps
>>>
>>> Mian Yousaf Kaukab added a bunch of changes to the net2280 driver
>>> between 4.1 and 4.2. Do:
>>>
>>> git log v4.1..v4.2 -- drivers/usb/gadget/udc/net2280.c
>>>
>>> One of them may be responsible.
>>
>> Commit 25d40ee8 (the very first change in net2280) gives me the above error.
>> Commit af6e613bb (just before that change) still works.
>
> 25d40ee should be very easy to revert. If this patch is causing the problem, can you try reverting it?
>
>> I pretty much tried all versions with net2280 changes up to 11bece5e.
>> All of them exhibit the above error.
>> For 11bece5e I couldn’t test as that revision hangs my kernel during boot up.
>
> for net2280_enable: error=-22 I would suspect following patches:
>
> c65c4f0 usb: gadget: net2280: fix use of GPEP in both directions
> e9ab4d0 usb: gadget: autoconf: net2280: match hardware and usb ep address
>

I added some debug messages to f_mass_storage/net2280, and this is what I am seeing:

1) net2280 <pci-id>: INFO Defect 7374 workaround waited about
2) the host port detects the mass storage device
3) configfs-gadget: gadget super-speed config #1: c
4) DEBUG: common->fsg is set
5) DEBUG: net2280_enable: failed at line=258, error=-22, 0!=1
6) DEBUG: common->fsg is cleared
7) usb-storage <device path>: USB Mass Storage device detected
8) usbcore: registered new interface driver usb-storage
9) usbcore: registered new interface driver has
10) configfs-gadget gadget: common->fsg is NULL in fsg_setup at 512

The real problem is that the check of whether the USB endpoint number matches the hardware endpoint number fails: net2280_enable returns -22 (step 5), common->fsg is cleared (step 6), and fsg_setup then finds it NULL (step 10).
In my case hardware = 0, desc = 1.

Any ideas on how that could happen?

Paul.
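
For anyone following along, the kind of check that fails at step 5 can be sketched roughly as below. This is a standalone illustration, not the actual net2280 code: `struct hw_ep`, `check_ep_number()` and `EP_NUMBER_MASK` are hypothetical names made up for this mail. It assumes the descriptor's endpoint number is the low four bits of bEndpointAddress (as the USB specification defines it) and that a mismatch is reported as -EINVAL, which is -22 on Linux and matches the error in the log.

```c
#include <errno.h>
#include <stdint.h>

/* Low four bits of bEndpointAddress hold the endpoint number;
 * bit 7 holds the direction. Name is hypothetical. */
#define EP_NUMBER_MASK 0x0f

/* Hypothetical stand-in for the driver's per-endpoint state. */
struct hw_ep {
	unsigned int num;	/* endpoint number fixed by the hardware */
};

/*
 * Compare the hardware endpoint number with the number encoded in
 * the endpoint descriptor. Returns 0 on a match, -EINVAL otherwise.
 */
static int check_ep_number(const struct hw_ep *ep, uint8_t bEndpointAddress)
{
	unsigned int desc_num = bEndpointAddress & EP_NUMBER_MASK;

	if (ep->num != desc_num)
		return -EINVAL;

	return 0;
}
```

With hardware = 0 and a descriptor that encodes endpoint 1 (for example bEndpointAddress 0x01 or 0x81, the direction is a guess on my side), this returns -EINVAL, which would correspond to the "0!=1" in the debug output above.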