Re: some question about EXDEV status in period schedule

yoma sophian <sophian.yoma@xxxxxxxxx> · Sun, 24 Nov 2013 22:07:23 +0800

2013/11/19 Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>:
> Please use Reply-To-All so that your message gets sent to the mailing
> list as well as to me.
Sorry for forgetting that :)

>
> On Sat, 16 Nov 2013, yoma sophian wrote:
>
> hi alan:
>

>> My questions are:
>> 1. in usb driver,
>>       ehci-sched.c --> itd_complete ->
>>
>>    if (unlikely (t & ISO_ERRS)) {
>>     ......................  (case A)
>>    } else if (likely ((t & EHCI_ISOC_ACTIVE) == 0)) {
>>     ........................   (case b)
>>    } else {
>>      /* URB was too late */    (case c)
>>       urb->error_count++;
>>     }
>> EXDEV seems to come from the above comment "/* URB was too late */"
>> >

>> >> If my conclusion is correct, why do we say "URB was too late"?
>> >> Does it mean the URB was too late to send or too late to service?
>> >
>> > The comment means "too late to service".
>>
>> Does that mean
>> 1. the iso completion IRQ happens, but
>> 2. scan_isoc is called too late, e.g. delayed by 30 ms?
>>
>> If so, EHCI_ISOC_ACTIVE will be 0 and the driver should not run into this case.
>
> No, it means the URB was submitted too late for the hardware to carry
> out the transfers at their scheduled times.

Could I say that this case means that, after the URB actually _was_
scheduled, the hardware does not service it until it times out?

>
>> >> 2. In my case, the urbs are always submitted with URB_ISO_ASAP.
>> >
>> > URBs with ISO_ASAP are never too late to service.  That's part of the
>> > reason I suspect the device stopped sending data.
>
> But that was wrong.  So it looks like the EHCI hardware is
> malfunctioning.
>

>>
>> BTW, I have one question.
>> 1. For the iso IN case:
>> even if an iso IN is too late to send, the ehci driver still puts those
>> delayed iTDs on the periodic schedule list.
>
> This never happens with URB_ISO_ASAP.  Instead, if an URB is too late
> to be sent, the driver reschedules it for a later time.
>
>> That means IN tokens will be sent on the bus.
>
> No.  During the interval between when the URB _should_ have been
> scheduled and when it actually _was_ scheduled, no IN tokens will be
> sent.

>
> However, in your case it looks like the EHCI controller went crazy and
> didn't send any IN tokens even when it should have.
>
> Here's the first place in your log where a problem occurs:
>
> d8a80400 78915228 C Zi:1:003:2 0:8:111:0 10 0:0:100 0:512:100
> d8a80400 78923303 S Zi:1:003:2 -115:8:0 10 -18:0:512 -18:512:512
> d8a80800 78925390 C Zi:1:003:2 0:8:191:0 10 0:0:100 0:512:100
> d8a80800 78927016 S Zi:1:003:2 -115:8:0 10 -18:0:512 -18:512:512
> d8a80400 78941545 C Zi:1:003:2 0:8:319:10 10 -18:0:0 -18:512:0
>
> In the first two lines, URB d8a80400 completes at 78.915 but doesn't
> get resubmitted until 78.923, which is 8 ms later.  In the meantime,
> URB d8a80800 was running, and it completes at 78.925.
>
> Because there is only 2 ms remaining in the pipeline when d8a80400 is
> resubmitted, the URB_ISO_ASAP flag causes it to be delayed.  Normally

I don't quite understand what "URB_ISO_ASAP flag causes it to be
delayed" means.
Does it mean that the 2 ms remaining is caused by URB_ISO_ASAP?

> it would have been scheduled to start in uframe 271, but instead it was
> scheduled to start in uframe 319, which was 6 ms later (I don't
> understand why this delay was so large; it shouldn't be more than 2
> ms).
Would you please let me know why you think it shouldn't be more than 2 ms?
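
For reference, this is how I reproduce the figures from the log, as a rough
sketch only. I am assuming that the third field of the URB status group (111,
191, 319) is the start frame counted in microframes, that the interval of 8 is
also in microframes, and that each URB carries 10 packets. The function names
are mine, not the kernel's, and frame-counter wraparound is ignored.

```c
#include <assert.h>

#define UFRAMES_PER_MS 8u	/* high speed: 8 microframes per 1 ms frame */

/*
 * Microframe right after the last packet slot of an isochronous URB,
 * i.e. where the next URB would start if the stream had no gap.
 * Assumption: no wraparound of the frame counter.
 */
static unsigned next_start_uframe(unsigned start, unsigned packets,
				  unsigned interval)
{
	assert(interval > 0);
	return start + packets * interval;
}

/* Number of packet slots left empty when the next URB starts late. */
static unsigned empty_slots(unsigned expected, unsigned actual,
			    unsigned interval)
{
	assert(interval > 0 && actual >= expected);
	return (actual - expected) / interval;
}
```

With the log values, next_start_uframe(111, 10, 8) gives 191 (the start of
d8a80800), next_start_uframe(191, 10, 8) gives 271, and
empty_slots(271, 319, 8) gives 6. The gap of 48 microframes divided by
UFRAMES_PER_MS is the 6 ms delay.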

> This means that 6 schedule slots were left empty -- no IN tokens.
>
> On the last line, you can see where d8a80400 completed after the delay.
> This line is full of -18 errors, which means the hardware did not carry
> out the transfers even though they were added to the schedule well in
> advance.  In fact, the hardware did not carry out any more transfers
> until almost 3 seconds later!
>
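
To check that I read the log the same way, here is a small sketch that decodes
one ISO descriptor from the usbmon text output. It assumes each descriptor is a
status:offset:length triple and that the Linux errno values are 18 for EXDEV
and 115 for EINPROGRESS. The struct and function names are hypothetical and
only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Linux errno values, hard-coded here so the sketch does not depend on the host OS */
#define LINUX_EXDEV		18
#define LINUX_EINPROGRESS	115

struct iso_desc {
	int status;		/* 0 or a negative errno */
	unsigned offset;	/* offset of the packet in the transfer buffer */
	unsigned length;	/* requested length on submit, actual length on completion */
};

/* Parse one "status:offset:length" group; returns 0 on success, -1 if malformed. */
static int parse_iso_desc(const char *s, struct iso_desc *d)
{
	assert(s != NULL && d != NULL);
	if (sscanf(s, "%d:%u:%u", &d->status, &d->offset, &d->length) != 3)
		return -1;
	return 0;
}

/* Human-readable name for the status values that appear in this log. */
static const char *iso_status_name(int status)
{
	if (status == 0)
		return "OK";
	if (status == -LINUX_EXDEV)
		return "-EXDEV";
	if (status == -LINUX_EINPROGRESS)
		return "-EINPROGRESS";
	return "other";
}
```

For example, "0:512:100" parses as a successful packet at offset 512 with 100
bytes received, while "-18:512:0" is a packet at offset 512 that completed with
-EXDEV and no data.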
>> If the device responds with data, the ehci driver will go into a) or b) below.
>
> No.  If the host sends an IN token then the driver will go into case a
> or case b.  If the host receives valid data from the device, the driver
> will go into case b.
>
>> If the device does not respond, scan_isoc will be called periodically by
>> the ehci_work timer and clean it up.
>
> If the device does not send data, the driver will go into case a.

Why does the driver go into case a if the device does not send data?
Case a means that the host found an error in the data it received.
If no data comes from the device, how does the host determine that there was an error?
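
To make sure I read the three branches correctly, here is a user-space sketch
of how I understand itd_complete classifies one transaction status word. The
bit positions are what I believe ehci.h and the EHCI spec define for the iTD
status field; the enum and function names are made up for illustration and are
not in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* iTD transaction status bits (assumed to match ehci.h) */
#define EHCI_ISOC_ACTIVE	(1u << 31)	/* slot still owned by the hardware */
#define EHCI_ISOC_BUF_ERR	(1u << 30)	/* data buffer error */
#define EHCI_ISOC_BABBLE	(1u << 29)	/* babble detected */
#define EHCI_ISOC_XACTERR	(1u << 28)	/* transaction error */
#define ISO_ERRS (EHCI_ISOC_BUF_ERR | EHCI_ISOC_BABBLE | EHCI_ISOC_XACTERR)

enum itd_case {
	CASE_A_ERROR,		/* hardware reported an error for this slot */
	CASE_B_DONE,		/* hardware finished the slot without error */
	CASE_C_TOO_LATE,	/* slot still active: hardware never serviced it */
};

/* Same order of tests as the if / else if / else chain in itd_complete. */
static enum itd_case classify_itd_status(uint32_t t)
{
	if (t & ISO_ERRS)
		return CASE_A_ERROR;
	if ((t & EHCI_ISOC_ACTIVE) == 0)
		return CASE_B_DONE;
	return CASE_C_TOO_LATE;
}

/* A few sanity checks on the classification. */
static void classify_selftest(void)
{
	assert(classify_itd_status(EHCI_ISOC_XACTERR) == CASE_A_ERROR);
	assert(classify_itd_status(0) == CASE_B_DONE);
	assert(classify_itd_status(EHCI_ISOC_ACTIVE) == CASE_C_TOO_LATE);
}
```

In this reading, case c is reached only when the Active bit is still set and no
error bit is set, which is the situation where the status keeps its initial
-EXDEV value.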

> The only reason for the periodic timer to fire is if the hardware isn't
> generating interrupts properly.  If that happens, the driver will go
> into case c.
>
>> The ehci driver will go into c) and -EXDEV will be left unmodified.
>>
>> 2. For the iso OUT case:
>> even if an iso OUT is too late to send, the ehci driver still puts those
>> delayed iTDs on the periodic schedule list.
>> That means OUT tokens and OUT data will be sent on the bus.
>
> No, OUT works the same way as IN, except that the device never sends
> anything.
>
>> If the device responds with data, the ehci driver will go into a) or b) below.
>
> The device never sends anything.  If the hardware is working correctly,
> the driver will always go into case b.
>
>> Since there is no ACK in iso, the ehci driver will never go into c).
>
> The driver will go into case c if the hardware isn't working right.
How about the following race condition, which could accidentally lead into case c?

Actually, bulk and iso transfers are interleaved on my system.
Suppose iTDs are submitted and then ehci_work is triggered by a bulk
interrupt while ehci->isoc_count > 0.

The race condition may happen if the hardware hasn't handled those iTDs yet, right?

BTW, why don't we just turn off the periodic schedule directly in
ehci_poll_PSS, as below? (We turn it on directly in ehci_poll_PSS, right?)
	if (want == 0) {	/* Stopped */
		if (ehci->periodic_count > 0)
			ehci_set_command_bit(ehci, CMD_PSE);

	} else {		/* Running */
		if (ehci->periodic_count == 0) {

			/* Turn off the schedule after a while */
-			ehci_enable_event(ehci, EHCI_HRTIMER_DISABLE_PERIODIC,
-					true);
+			ehci_clear_command_bit(ehci, CMD_PSE);

		}
	}

>
> Have you tried running the same test using a regular PC, rather than an
> embedded system?
Yes. Actually, it is a user-mode USB driver written with libusb,
and we are trying to do the cross-check right now.

Appreciate your kind help,