Re: uas regression in 3.15 with Lacie Rugged USB3 and NEC uPD720200

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 07/16/2014 11:42 AM, Oliver Neukum wrote:
> 
> On Wed, 2014-07-16 at 10:44 +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 07/16/2014 08:45 AM, Oliver Neukum wrote:
>>> On Tue, 2014-07-15 at 22:23 +0200, Julian Sikorski wrote:
>>>> [   54.017997] sd 6:0:0:0: [sdb] Write cache: enabled, read cache:
>>>> enabled, doesn't support DPO or FUA
>>>> [   54.018119] sd 6:0:0:0: uas_sense_old: urb length 26 disagrees with
>>>> IU sense data length 510, using 18 bytes of sense data
>>>> [   85.060805] sd 6:0:0:0: uas_eh_abort_handler ffff8803120b7400 tag
>>>> 0,
>>>> inflight: IN
>>>> [   85.061835] sd 6:0:0:0: abort completed
>>>> [   85.062037] ------------[ cut here ]------------
>>>> [   85.062191] kernel BUG at block/blk-core.c:2511!
>>>> [   85.062328] invalid opcode: 0000 [#1] SMP
>>>> [   85.062573] Modules linked in: uas usb_storage
>>>> nf_conntrack_netbios_ns nf_conntrack_broadcast ccm ip6t_rpfilter
>>>> ip6t_REJECT xt_conntrack bnep bluetooth ebtable_nat ebtable_broute
>>>> bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6
>>>> nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security
>>>> ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4
>>>> nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle
>>>> iptable_security iptable_raw fuse iTCO_wdt iTCO_vendor_support
>>>> x86_pkg_temp_thermal coretemp snd_hda_codec_realtek kvm_intel kvm arc4
>>>> snd_hda_codec_generic snd_hda_codec_hdmi crct10dif_pclmul iwldvm
>>>> crc32_pclmul crc32c_intel mac80211 sdhci_pci uvcvideo
>>>> videobuf2_vmalloc
>>>> videobuf2_memops videobuf2_core snd_hda_intel ghash_clmulni_intel
>>>> videodev iwlwifi snd_hda_controller
>>>> [   85.066723]  media snd_hda_codec sdhci cfg80211 snd_hwdep microcode
>>>> jme snd_seq snd_seq_device snd_pcm mii serio_raw mmc_core jmb38x_ms
>>>> rfkill memstick mei_me mei snd_timer shpchp lpc_ich i2c_i801 mfd_core
>>>> i2c_core snd soundcore nfsd auth_rpcgss nfs_acl lockd sunrpc
>>>> binfmt_misc
>>>> firewire_ohci firewire_core crc_itu_t wmi video
>>>> [   85.069089] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
>>>> 3.15.4-200.fc20.x86_64+debug #1
>>>> [   85.069297] Hardware name: CLEVO
>>>> P150HMx/P150HMx, BIOS 4.6.4 08/09/2011
>>>> [   85.069509] task: ffffffff81c184c0 ti: ffffffff81c00000 task.ti:
>>>> ffffffff81c00000
>>>> [   85.069716] RIP: 0010:[<ffffffff813986d0>]  [<ffffffff813986d0>]
>>>> blk_finish_request+0xf0/0x100
>>>> [   85.069987] RSP: 0018:ffff880321e03da0  EFLAGS: 00010002
>>>> [   85.070122] RAX: 0000000000000286 RBX: ffff88031b73d730 RCX:
>>>> 0000000000000000
>>>> [   85.070261] RDX: ffff880321e0eb20 RSI: 00000000fffffffb RDI:
>>>> ffff88031b73d730
>>>> [   85.071768] RBP: ffff880321e03db0 R08: 0000000000000000 R09:
>>>> 0000000000000001
>>>> [   85.071907] R10: ffffffff81c184c0 R11: ffffffff81c18d80 R12:
>>>> 00000000fffffffb
>>>> [   85.072046] R13: ffff8803190625b0 R14: 0000000000000286 R15:
>>>> ffff8803120b7400
>>>> [   85.072186] FS:  0000000000000000(0000) GS:ffff880321e00000(0000)
>>>> knlGS:0000000000000000
>>>> [   85.072395] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [   85.072531] CR2: 00007f6c210209c0 CR3: 0000000001c11000 CR4:
>>>> 00000000000407f0
>>>> [   85.072669] Stack:
>>>> [   85.072794]  ffff88031b73d730 00000000fffffffb ffff880321e03de0
>>>> ffffffff81398722
>>>> [   85.073205]  ffff88031b73d730 0000000008000000 00000000fffffffb
>>>> ffff88031b73d730
>>>> [   85.073615]  ffff880321e03df0 ffffffff81398750 ffff880321e03e58
>>>> ffffffff815265c1
>>>> [   85.074024] Call Trace:
>>>> [   85.074151]  <IRQ>
>>>> [   85.074210]
>>>> [   85.074387]  [<ffffffff81398722>] blk_end_bidi_request+0x42/0x60
>>>> [   85.074523]  [<ffffffff81398750>] blk_end_request+0x10/0x20
>>>> [   85.074664]  [<ffffffff815265c1>] scsi_io_completion+0x111/0x6a0
>>>> [   85.074803]  [<ffffffff8151af03>] scsi_finish_command+0xb3/0x110
>>>> [   85.074943]  [<ffffffff815263c7>] scsi_softirq_done+0x137/0x160
>>>> [   85.075082]  [<ffffffff8139fe7b>] blk_done_softirq+0x9b/0xd0
>>>> [   85.075223]  [<ffffffff8109caad>] __do_softirq+0x12d/0x480
>>>> [   85.075361]  [<ffffffff8109d135>] irq_exit+0xc5/0xd0
>>>> [   85.075499]  [<ffffffff817f0598>] do_IRQ+0x58/0xf0
>>>> [   85.075635]  [<ffffffff817e4432>] common_interrupt+0x72/0x72
>>>> [   85.075769]  <EOI>
>>>> [   85.075827]
>>>> [   85.076006]  [<ffffffff8162e494>] ? cpuidle_enter_state+0x54/0xd0
>>>> [   85.076142]  [<ffffffff8162e547>] cpuidle_enter+0x17/0x20
>>>> [   85.076280]  [<ffffffff810eef4e>] cpu_startup_entry+0x3ee/0x770
>>>> [   85.076420]  [<ffffffff817c914a>] rest_init+0x13a/0x140
>>>> [   85.076557]  [<ffffffff817c9015>] ? rest_init+0x5/0x140
>>>> [   85.076695]  [<ffffffff81f1c006>] start_kernel+0x48d/0x4ae
>>>> [   85.076833]  [<ffffffff81f1b982>] ? repair_env_string+0x5c/0x5c
>>>> [   85.076972]  [<ffffffff81f1b120>] ? early_idt_handlers+0x120/0x120
>>>> [   85.077112]  [<ffffffff81f1b5ee>] x86_64_start_reservations
>>>> +0x2a/0x2c
>>>> [   85.077252]  [<ffffffff81f1b73d>] x86_64_start_kernel+0x14d/0x170
>>>> [   85.077388] Code: 89 de e8 84 18 00 00 e9 4b ff ff ff 83 7b 78 01
>>>> 0f
>>>> 85 58 ff ff ff 48 8b 43 60 48 8d b8 78 02 00 00 e8 d5 69 e2 ff e9 43
>>>> ff
>>>> ff ff <0f> 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90
>>>> [   85.081015] RIP  [<ffffffff813986d0>] blk_finish_request+0xf0/0x100
>>>> [   85.081206]  RSP <ffff880321e03da0>
>>>
>>> It looks like UAS is calling scsi_done() twice.
>>>
>>> I may be dense, but it looks like uas_data_cmplt() will call
>>> uas_try_complete() with only the UNLINK_DATA_URBS flag protecting
>>> it from calling scsi_done() twice. As uas_unlink_data_urbs() uses
>>> usb_unlink_urb() we are racing withthe completion handler.
>>
>> I don't think that is possible. You're right that usb_unlink_urb() does not
>> guarantee that the completion handler has actually run when it returns,
>> yet we still need the unlock + relock + protection flag because
>>
>> "This request is asynchronous, however the HCD might call the ->complete
>> callback during unlink. Therefore when drivers call usb_unlink_urb, they
>> must not hold any locks ..."
>>
>> So yes we have this race where the uas_data_cmplt() and uas_stat_cmplt()
>> which calls uas_unlink_data_urbs() run at the same time, but uas_stat_cmplt()
>> has retaken the lock (through the re-locking in uas_unlink_data_urbs()) before
>> it does cmdinfo->state &= ~COMMAND_INFLIGHT, and likewise uas_data_cmplt
>> does cmdinfo->state &= ~DATA_XX_URB_INFLIGHT with the lock, and they both
>> call uas_try_complete with the lock held.
>>
>> So yes you're right that they both will call uas_try_complete, but only
>> one for one of them this will not trigger:
>>
>>         if (cmdinfo->state & (COMMAND_INFLIGHT |
>>                               DATA_IN_URB_INFLIGHT |
>>                               DATA_OUT_URB_INFLIGHT |
>>                               	))
>>                 return -EBUSY;
>>
>> As only for one of them all flags will be cleared, even if the unlink
>> results in uas_data_cmplt() running after UNLINK_DATA_URBS has already
>> been cleared.
>>
>> So as far as I can see the problem you think is there is not there.
>>
>> If you've a specific sequence of events in mind which you believe will
>> be a problem please spell it out step by step, and then we'll see from
>> there.
> 
> CPU A					CPU B
> URB completed in hardware
> uas_stat_cmplt()
> uas_try_complete()

If there are data-urbs which still need to comlete,
then scsi_done won't get called.

> scsi_done()
> 				uas_unlink_data_urbs

How do we get here, from which urb-completion handler
does this path get called? It looks like a code
path from urb_stat_cmplt, but if CPU a has already
done that, then how will B also run it, that makes
no sense, the stat urb will complete only once.

> 				UNLINK_DATA_URBS set
> 				usb_unlink_urb() [NOP]
> 				UNLINK_DATA_URBS unset
> 				uas_try_complete()
> 				[COMMAND_INFLIGHT not set]
> 				[DATA_IN_URB_INFLIGHT not set]
> 				scsi_done()

Regards,

Hans

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux