On Wed, 2014-07-16 at 10:44 +0200, Hans de Goede wrote: > Hi, > > On 07/16/2014 08:45 AM, Oliver Neukum wrote: > > On Tue, 2014-07-15 at 22:23 +0200, Julian Sikorski wrote: > >> [ 54.017997] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: > >> enabled, doesn't support DPO or FUA > >> [ 54.018119] sd 6:0:0:0: uas_sense_old: urb length 26 disagrees with > >> IU sense data length 510, using 18 bytes of sense data > >> [ 85.060805] sd 6:0:0:0: uas_eh_abort_handler ffff8803120b7400 tag > >> 0, > >> inflight: IN > >> [ 85.061835] sd 6:0:0:0: abort completed > >> [ 85.062037] ------------[ cut here ]------------ > >> [ 85.062191] kernel BUG at block/blk-core.c:2511! > >> [ 85.062328] invalid opcode: 0000 [#1] SMP > >> [ 85.062573] Modules linked in: uas usb_storage > >> nf_conntrack_netbios_ns nf_conntrack_broadcast ccm ip6t_rpfilter > >> ip6t_REJECT xt_conntrack bnep bluetooth ebtable_nat ebtable_broute > >> bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 > >> nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security > >> ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 > >> nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle > >> iptable_security iptable_raw fuse iTCO_wdt iTCO_vendor_support > >> x86_pkg_temp_thermal coretemp snd_hda_codec_realtek kvm_intel kvm arc4 > >> snd_hda_codec_generic snd_hda_codec_hdmi crct10dif_pclmul iwldvm > >> crc32_pclmul crc32c_intel mac80211 sdhci_pci uvcvideo > >> videobuf2_vmalloc > >> videobuf2_memops videobuf2_core snd_hda_intel ghash_clmulni_intel > >> videodev iwlwifi snd_hda_controller > >> [ 85.066723] media snd_hda_codec sdhci cfg80211 snd_hwdep microcode > >> jme snd_seq snd_seq_device snd_pcm mii serio_raw mmc_core jmb38x_ms > >> rfkill memstick mei_me mei snd_timer shpchp lpc_ich i2c_i801 mfd_core > >> i2c_core snd soundcore nfsd auth_rpcgss nfs_acl lockd sunrpc > >> binfmt_misc > >> firewire_ohci firewire_core crc_itu_t wmi video > >> [ 85.069089] CPU: 0 PID: 0 Comm: swapper/0 Not tainted > >> 3.15.4-200.fc20.x86_64+debug #1 > >> [ 85.069297] Hardware name: CLEVO > >> P150HMx/P150HMx, BIOS 4.6.4 08/09/2011 > >> [ 85.069509] task: ffffffff81c184c0 ti: ffffffff81c00000 task.ti: > >> ffffffff81c00000 > >> [ 85.069716] RIP: 0010:[<ffffffff813986d0>] [<ffffffff813986d0>] > >> blk_finish_request+0xf0/0x100 > >> [ 85.069987] RSP: 0018:ffff880321e03da0 EFLAGS: 00010002 > >> [ 85.070122] RAX: 0000000000000286 RBX: ffff88031b73d730 RCX: > >> 0000000000000000 > >> [ 85.070261] RDX: ffff880321e0eb20 RSI: 00000000fffffffb RDI: > >> ffff88031b73d730 > >> [ 85.071768] RBP: ffff880321e03db0 R08: 0000000000000000 R09: > >> 0000000000000001 > >> [ 85.071907] R10: ffffffff81c184c0 R11: ffffffff81c18d80 R12: > >> 00000000fffffffb > >> [ 85.072046] R13: ffff8803190625b0 R14: 0000000000000286 R15: > >> ffff8803120b7400 > >> [ 85.072186] FS: 0000000000000000(0000) GS:ffff880321e00000(0000) > >> knlGS:0000000000000000 > >> [ 85.072395] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >> [ 85.072531] CR2: 00007f6c210209c0 CR3: 0000000001c11000 CR4: > >> 00000000000407f0 > >> [ 85.072669] Stack: > >> [ 85.072794] ffff88031b73d730 00000000fffffffb ffff880321e03de0 > >> ffffffff81398722 > >> [ 85.073205] ffff88031b73d730 0000000008000000 00000000fffffffb > >> ffff88031b73d730 > >> [ 85.073615] ffff880321e03df0 ffffffff81398750 ffff880321e03e58 > >> ffffffff815265c1 > >> [ 85.074024] Call Trace: > >> [ 85.074151] <IRQ> > >> [ 85.074210] > >> [ 85.074387] [<ffffffff81398722>] blk_end_bidi_request+0x42/0x60 > >> [ 85.074523] [<ffffffff81398750>] blk_end_request+0x10/0x20 > >> [ 85.074664] [<ffffffff815265c1>] scsi_io_completion+0x111/0x6a0 > >> [ 85.074803] [<ffffffff8151af03>] scsi_finish_command+0xb3/0x110 > >> [ 85.074943] [<ffffffff815263c7>] scsi_softirq_done+0x137/0x160 > >> [ 85.075082] [<ffffffff8139fe7b>] blk_done_softirq+0x9b/0xd0 > >> [ 85.075223] [<ffffffff8109caad>] __do_softirq+0x12d/0x480 > >> [ 85.075361] [<ffffffff8109d135>] irq_exit+0xc5/0xd0 > >> [ 85.075499] [<ffffffff817f0598>] do_IRQ+0x58/0xf0 > >> [ 85.075635] [<ffffffff817e4432>] common_interrupt+0x72/0x72 > >> [ 85.075769] <EOI> > >> [ 85.075827] > >> [ 85.076006] [<ffffffff8162e494>] ? cpuidle_enter_state+0x54/0xd0 > >> [ 85.076142] [<ffffffff8162e547>] cpuidle_enter+0x17/0x20 > >> [ 85.076280] [<ffffffff810eef4e>] cpu_startup_entry+0x3ee/0x770 > >> [ 85.076420] [<ffffffff817c914a>] rest_init+0x13a/0x140 > >> [ 85.076557] [<ffffffff817c9015>] ? rest_init+0x5/0x140 > >> [ 85.076695] [<ffffffff81f1c006>] start_kernel+0x48d/0x4ae > >> [ 85.076833] [<ffffffff81f1b982>] ? repair_env_string+0x5c/0x5c > >> [ 85.076972] [<ffffffff81f1b120>] ? early_idt_handlers+0x120/0x120 > >> [ 85.077112] [<ffffffff81f1b5ee>] x86_64_start_reservations > >> +0x2a/0x2c > >> [ 85.077252] [<ffffffff81f1b73d>] x86_64_start_kernel+0x14d/0x170 > >> [ 85.077388] Code: 89 de e8 84 18 00 00 e9 4b ff ff ff 83 7b 78 01 > >> 0f > >> 85 58 ff ff ff 48 8b 43 60 48 8d b8 78 02 00 00 e8 d5 69 e2 ff e9 43 > >> ff > >> ff ff <0f> 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 > >> [ 85.081015] RIP [<ffffffff813986d0>] blk_finish_request+0xf0/0x100 > >> [ 85.081206] RSP <ffff880321e03da0> > > > > It looks like UAS is calling scsi_done() twice. > > > > I may be dense, but it looks like uas_data_cmplt() will call > > uas_try_complete() with only the UNLINK_DATA_URBS flag protecting > > it from calling scsi_done() twice. As uas_unlink_data_urbs() uses > > usb_unlink_urb() we are racing withthe completion handler. > > I don't think that is possible. You're right that usb_unlink_urb() does not > guarantee that the completion handler has actually run when it returns, > yet we still need the unlock + relock + protection flag because > > "This request is asynchronous, however the HCD might call the ->complete > callback during unlink. Therefore when drivers call usb_unlink_urb, they > must not hold any locks ..." > > So yes we have this race where the uas_data_cmplt() and uas_stat_cmplt() > which calls uas_unlink_data_urbs() run at the same time, but uas_stat_cmplt() > has retaken the lock (through the re-locking in uas_unlink_data_urbs()) before > it does cmdinfo->state &= ~COMMAND_INFLIGHT, and likewise uas_data_cmplt > does cmdinfo->state &= ~DATA_XX_URB_INFLIGHT with the lock, and they both > call uas_try_complete with the lock held. > > So yes you're right that they both will call uas_try_complete, but only > one for one of them this will not trigger: > > if (cmdinfo->state & (COMMAND_INFLIGHT | > DATA_IN_URB_INFLIGHT | > DATA_OUT_URB_INFLIGHT | > )) > return -EBUSY; > > As only for one of them all flags will be cleared, even if the unlink > results in uas_data_cmplt() running after UNLINK_DATA_URBS has already > been cleared. > > So as far as I can see the problem you think is there is not there. > > If you've a specific sequence of events in mind which you believe will > be a problem please spell it out step by step, and then we'll see from > there. CPU A CPU B URB completed in hardware uas_stat_cmplt() uas_try_complete() scsi_done() uas_unlink_data_urbs UNLINK_DATA_URBS set usb_unlink_urb() [NOP] UNLINK_DATA_URBS unset uas_try_complete() [COMMAND_INFLIGHT not set] [DATA_IN_URB_INFLIGHT not set] scsi_done() Regards Oliver -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html