RE: [PATCH RESEND v5 0/3] ISOC supporting for xHCI

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> -----Original Message-----
> From: Sarah Sharp [mailto:sarah.a.sharp@xxxxxxxxxxxxxxx]
> Hi Andiry,
> 
> I've been able to grab the oops message out after turning off a lot of
> debugging so that I could use netconsole.  The modifications I made to
> turn off
> debugging are on my amd-isoc branch.  This time the oops was triggered
> with just a HS webcam and a FS webcam plugged into a HS hub.
> 

Hi Sarah,

When did the oops triggered? Did it occur when you start the
application, when the application is running, or when you quit the
application? Can it be reproduced every time? I'm using a HS tv-tuner, a
FS webcam plugged into a HS hub and trying to reproduce it.

> The dmesg is attached.  The first thing I notice is that you're giving
> back URB
> ffff88012fe9f000 twice, any idea how that could happen?
> 

It seems the URB is given back twice successfully, which is strange.
when the first time usb_hcd_giveback_urb() called, urb->hcpriv is set to
NULL, and it will fail the second time as NULL pointer dereference when
using urb_priv. 
As I can see, does URB ffff88012fe9f000 more like a resubmitted URB or
duplicate print? The issue occurs with URB ffff88012fe9e800.
 
> When I ran the trimmed oops message through scripts/markup_oops.pl,
got
> the following output:
> 
> No vmlinux specified, assuming /lib/modules/2.6.34-rc7/build/vmlinux
>  		break;
>  	case TRB_TYPE(TRB_TRANSFER):
>  		ret = handle_tx_event(xhci, &event->trans_event);
>  		if (ret < 0)
>  			xhci->error_bitmask |= 1 << 9;
>  ffffffffa030179e:	80 cc 02             	or     $0x2,%ah
>  ffffffffa03017a1:	89 83 c0 09 00 00    	mov    %eax,0x9c0(%rbx)
|
> %ebx = ffff88012e146220
>  ffffffffa03017a7:	b8 01 00 00 00       	mov    $0x1,%eax   |
%eax
> => ffff88012fe9e800
>  ffffffffa03017ac:	e9 af f3 ff ff       	jmpq   ffffffffa0300b60
> <xhci_handle_event+0xa0>
>  ffffffffa03017b1:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)   |
%eax
> = ffff88012fe9e800
>  		int skip_td = 0;
>  		union xhci_trb *cur_trb;
>  		struct xhci_segment *cur_seg;
> 
>  		urb_priv = td->urb->hcpriv;
>  		idx = urb_priv->td_cnt;
>  ffffffffa03017b8:	48 8b 50 08          	mov    0x8(%rax),%rdx |
> %eax = ffff88012fe9e800  %edx => 0
> *ffffffffa03017bc:	8b 4a 04             	mov    0x4(%rdx),%ecx |
> %edx = 0  %ecx = ffff8800b7dc7680 <--- faulting instruction
>  		status = 0;
> 
>  		if (ep->skip) {
>  ffffffffa03017bf:	4b 8d 54 ad 00       	lea
> 0x0(%r13,%r13,4),%rdx
>  ffffffffa03017c4:	48 8d 14 92          	lea
(%rdx,%rdx,4),%rdx
>  ffffffffa03017c8:	41 80 bc d7 e0 00 00 	cmpb
> $0x0,0xe0(%r15,%rdx,8)
>  ffffffffa03017cf:	00 00
>  ffffffffa03017d1:	0f 85 bf 00 00 00    	jne    ffffffffa0301896
> <xhci_handle_event+0xdd6>
>  						 -EREMOTEIO;
>  			else
>  				td->urb->iso_frame_desc[idx].status = 0;
>  		} else {
>  			/* handle completion code */
>  			switch (trb_comp_code) {
>  ffffffffa03017d7:	83 7d cc 1f          	cmpl   $0x1f,-0x34(%rbp)
>  ffffffffa03017db:	0f 86 e4 01 00 00    	jbe    ffffffffa03019c5
> <xhci_handle_event+0xf05>
>  			case COMP_STOP:
> 
> So it looks like the urb_priv might be NULL.  If you're giving back
URBs
> that have already been given back, then that could be the cause.
>

I wonder why this would happen. It seems the urb of the td has already
been given back, so the urb_priv is NULL. But a URB can be given back
only when urb_priv->td_cnt is equal to urb_priv->length.
Urb_priv->td_cnt is increased only in handle_tx_event() when the td is
normally processed, or in xhci_giveback_urb_in_irq(), when the td is
cancelled or killed. As I can see, the two cases will never occur with
the same td. Do I miss something?

The other thing I notice is the skip flag of the ep is set but not
clear. That's not right. The skip flag should be cleared when the
corresponding td is found or when the ep td_list is empty. And there is
a Event Ring Full Error and a Missed Service Error in event ring. The
xHC missed so many tds, from td at 6a2c4ab0 to td at 6e1b4810, as I
count more than 80 tds missed. Is that normal? 

It's easy to add a urb_priv check that's not the root cause. I will try
to figure it out. Any suggestions are greatly appreciated.

Thanks,
Andiry

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux