Re: [patch 2.6.29] usb: ehci-sched.c: EHCI SITD scheduling bugfix (resend)

Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> · Tue, 14 Apr 2009 15:11:12 -0400 (EDT)

We seem to be approaching agreement.  I'm going to skip over a bunch of 
stuff from your emails as being no longer relevant, if you don't 
mind...

On Tue, 14 Apr 2009, Dan Streetman wrote:

> This would be rebalancing, which as far as I know isn't done for TT
> schedules in the current driver.  Not that rebalancing is bad, it
> would be great to add rebalancing to the TT scheduler.  But that isn't
> simple.

It certainly isn't.  But without it, you're somewhat hobbled.

> >> If the requested uframe isn't full, and the transfer is an isoc
> >> transfer that is larger than 1 uframe (125usec), then the start
> >> uframe, and each subsequent fully-used uframe must be empty.
> >
> > I still disagree about the requirement that the start uframe be empty.
> 
> I think we agree.  The first uframe doesn't have to be empty, but the
> scheduling is more complicated if it isn't.  The current code puts it
> starting in an empty uframe because that is easier.

In fact, I have learned that for an isoc-OUT transfer larger than 188
bytes, the first uframe _does_ have to be empty -- i.e., there can't
be any other SSPLITs scheduled for that uframe.  This is because the
transfer has to be broken up into a bunch of SSPLITs, of which all but
the last must be 188 bytes long and therefore occupy an entire uframe.

(The actual transaction on the full-speed bus may share a uframe with
the preceding transaction; that's not what I meant.)

> I believe that within a uframe, isoc does occur before interrupt.  Is
> there anywhere in the spec that says the TT must be scheduled the same
> way, by rebalancing all interrupts to happen after isoc?

I overstated the case; it's not true that isoc transfers must always 
occur before interrupt transfers on the full-speed bus.

What is true is this: If an isoc-IN transfer comes after an interrupt
transfer on the full/low-speed bus (i.e., if the isoc-IN SSPLIT is
scheduled in a later uframe than the interrupt SSPLIT), then the 
CSPLITs for the two transactions cannot occur in the same uframe.

For example, let's suppose an interrupt transfer is scheduled to start
in B-uframe 1 (its SSPLIT is sent in B-uframe 0).  Then its CSPLITs are
scheduled in B-uframes 2, 3, and 4.  Consequently a later isoc-IN
must not have any CSPLITs scheduled for those uframes, which implies
that its SSPLIT cannot occur any earlier than B-uframe 3.

Why?  This follows from the constraint (implicit in the USB spec and
stated explicitly in the EHCI spec) that in any uframe, CSPLITs must be
sent in the same order as their corresponding SSPLITs.  Since an siTD 
will always precede a QH in the periodic list's chain of pointers, the 
isoc CSPLIT would precede the interrupt CSPLIT if they both occurred in 
the same uframe -- and that would violate the constraint.
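
To make the example concrete, here is a minimal sketch of the overlap check using per-uframe bitmasks. The function names are hypothetical and are not taken from ehci-sched.c. The sketch assumes CSPLITs begin two uframes after the SSPLIT (as in the example above), that an interrupt transfer gets three CSPLITs, and that nothing wraps past the end of the frame.

```c
#include <assert.h>
#include <stdint.h>

/* Bit n set means "a CSPLIT is scheduled in uframe n" (n = 0..7). */

/* Interrupt transfer: three CSPLITs starting two uframes after the
 * SSPLIT, e.g. SSPLIT in uframe 0 -> CSPLITs in uframes 2, 3, 4. */
static uint8_t intr_csplit_mask(int ssplit_uframe)
{
	assert(ssplit_uframe >= 0 && ssplit_uframe <= 3);
	return (uint8_t)(0x07u << (ssplit_uframe + 2));
}

/* Isoc-IN transfer (simplified assumption): n_csplits consecutive
 * CSPLITs starting two uframes after the SSPLIT. */
static uint8_t isoc_in_csplit_mask(int ssplit_uframe, int n_csplits)
{
	assert(ssplit_uframe >= 0 && n_csplits >= 1);
	assert(ssplit_uframe + 2 + n_csplits <= 8);
	return (uint8_t)(((1u << n_csplits) - 1u) << (ssplit_uframe + 2));
}

/* Earliest uframe for an isoc-IN SSPLIT that comes after the interrupt
 * SSPLIT and shares no CSPLIT uframe with it; -1 if none fits. */
static int earliest_isoc_in_ssplit(int intr_ssplit, int n_csplits)
{
	uint8_t intr = intr_csplit_mask(intr_ssplit);

	for (int u = intr_ssplit + 1; u + 2 + n_csplits <= 8; u++) {
		if ((isoc_in_csplit_mask(u, n_csplits) & intr) == 0)
			return u;
	}
	return -1;
}
```

With the interrupt SSPLIT in uframe 0, earliest_isoc_in_ssplit(0, 1) returns 3, matching the reasoning above.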

> I disagree that interrupt OUT transfers have to occur in a single
> uframe.  Where does it say that in the spec?

I expressed myself poorly.  What I meant was this: In the best-case
accounting of full/low-speed transactions used by the scheduler, an
interrupt-OUT transfer has to occur within a single uframe.  The actual
transactions on the bus don't have such a restriction.

What is the reason?  It's because a TT isn't required to buffer more
than 188 bytes per uframe.

For example, suppose we want to schedule a 150-byte isoc-OUT transfer
followed by a 64-byte interrupt-OUT.  In the best-case accounting, we
would of course fit the isoc-OUT entirely in uframe 0, with 38 bytes
remaining.  Suppose that we tried to account for the interrupt-OUT
transfer starting in uframe 0 as well, with the last 26 bytes spilling
over into uframe 1.  In order to make the TT start both transactions in
uframe 0, during the preceding uframe we would have to send the TT a
150-byte isoc-OUT SSPLIT and a 64-byte interrupt-OUT SSPLIT.  (It's
impossible to divide the interrupt transfer into a 38-byte SSPLIT
followed by a 26-byte SSPLIT in the next uframe -- interrupt transfers
can have only one SSPLIT.)

But this would overflow the TT's 188-byte buffer.  Consequently there
is no choice but to arrange the accounting so that the entire
interrupt-OUT transaction occurs within uframe 1.

Even if the 150-byte isoc transfer were IN instead of OUT, the TT would
still have to buffer the entire 150 + 64 bytes during the time between
the end of the isoc bus transaction and the start of the interrupt bus
transaction.  Maybe some TTs can do this; I wouldn't rely on it.
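
The buffer argument can be sketched as a simple sum over the OUT payloads whose SSPLITs would land in the same uframe. This is only an illustration of the arithmetic; the helper name and the constant are made up for this example and do not correspond to anything in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Per-uframe data a TT is required to buffer (from the text above). */
#define TT_BYTES_PER_UFRAME 188

/* Returns true if all the SSPLIT payloads sent to the TT during one
 * uframe fit within the 188 bytes the TT must be able to buffer. */
static bool ssplits_fit_in_uframe(const int *ssplit_bytes, int count)
{
	int total = 0;

	assert(count >= 0);
	for (int i = 0; i < count; i++) {
		assert(ssplit_bytes[i] >= 0);
		total += ssplit_bytes[i];
	}
	return total <= TT_BYTES_PER_UFRAME;
}
```

For the case above, the payloads { 150, 64 } sum to 214 bytes, so the check fails; 150 bytes alone passes.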

> Where in the spec does it say that TTs have to be rebalanced to have
> interrupt after isoc?

As I mentioned above, strictly speaking they don't -- but the schedule 
does need to be set up so that no isoc-OUT SSPLIT occurs in the two 
uframes following an interrupt SSPLIT, unless a frame boundary 
intervenes.

> And yes, large isoc in transfers can start in the middle of a uframe,
> but calculating the bandwidth for this is more complicated.  As I
> said, the carryover math can be improved; starting all multi-uframe
> transfers in an empty uframe is simply easier.

Okay, that's something for me to work on ... when there's time!

> > Which leaves only isoc OUT.  Both the USB and EHCI specs say (although
> > you have to look pretty closely to see it) that if an isoc OUT SSPLIT
> > is scheduled for both uframes N and N+1, then in uframe N it must be
> > scheduled for 188 bytes.  Ergo, it has to occupy the entire uframe.
> 
> Clearly; the TT can't do only part of an isoc transfer, then do
> another transfer, then go back to the first transfer.  It has to do
> the entire isoc transfer on the full-speed bus as a single transfer.
> It breaks that up on the high-speed side by uframes.

I don't follow your logic.  This has nothing to do with intermingling 
transfers.

In principle, if the first 88 bytes of a uframe were already accounted
for, then a 200-byte isoc-OUT could be scheduled using a 100-byte
SSPLIT followed by another 100-byte SSPLIT in the next uframe.  But the
spec doesn't allow this and the hardware doesn't support it.  The first
SSPLIT _has_ to be 188 bytes.
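
As a sketch of the rule just described, the following splits an isoc-OUT payload into per-uframe SSPLIT sizes, where every SSPLIT except the last carries exactly 188 bytes. The function name and signature are hypothetical; this is not how ehci-sched.c is written, only the rule restated as code.

```c
#include <assert.h>

/* Maximum isoc-OUT payload carried by one SSPLIT (from the text above). */
#define TT_BYTES_PER_UFRAME 188

/* Fills sizes[] with the payload of each SSPLIT, one per uframe.
 * Returns the number of SSPLITs, or -1 if more than max_splits
 * would be needed. */
static int isoc_out_ssplit_sizes(int len, int *sizes, int max_splits)
{
	int n = 0;

	assert(len > 0 && max_splits > 0);
	while (len > 0) {
		int chunk = len > TT_BYTES_PER_UFRAME ? TT_BYTES_PER_UFRAME : len;

		if (n == max_splits)
			return -1;
		sizes[n++] = chunk;	/* all but the last chunk are 188 */
		len -= chunk;
	}
	return n;
}
```

A 200-byte transfer comes out as 188 + 12, never 100 + 100, which is why the first uframe cannot already hold part of another SSPLIT.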

> > On the other hand, I still don't see how this patch would fix the bug
> > as originally reported.
> 
> I'm not sure what you are missing? ;-)  The carryover bandwidth logic
> prevents any transfer from happening (in best case) more than 1 uframe
> after its SSPLIT, with the precondition that no transfer is more than
> 1 uframe.  The special-case of transfers larger than 1 uframe is
> handled by scheduling them only in empty uframes (except the partially
> used uframe).  The bug of -1 misses the last fully-used uframe.
> Removing the -1 fixes this.
> 
> In fact it fixes the problem so it does exactly what you just said; it
> enforces scheduling multi-uframe transfers only starting in an empty
> uframe, with empty uframes for each of the following fully-used
> uframes, except the last partially-used uframe.  Currently, the code
> misses the last fully-used uframe because of the -1 bug.

I agree that the -1 fix is needed.  But I don't agree that the
resulting code is bug-free.  For example, suppose there's nothing but a
64-byte interrupt transfer in uframe 0, and you want to add a 150-byte
isoc-OUT.  How will the carryover logic either prevent this or force
the isoc-OUT to start no earlier than uframe 3?

Another issue -- all the calculations in tt_available() and 
carryover_tt_bandwidth() are done in usec instead of bytes.  This 
renders them subject to roundoff errors.
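
A small illustration of how that kind of roundoff can accumulate. The conversion below is a hypothetical one (188 bytes per 125 usec, truncated to whole usec) chosen only to show the effect. It is NOT the formula the kernel uses to compute bus time.

```c
#include <assert.h>

/* Hypothetical bytes -> usec conversion: 188 bytes per 125 usec,
 * truncated to an integer number of usec. */
static int bytes_to_usec_trunc(int bytes)
{
	assert(bytes >= 0);
	return bytes * 125 / 188;
}

/* Accounting in usec: convert each transfer, then add. */
static int sum_usec_per_transfer(const int *bytes, int count)
{
	int usec = 0;

	for (int i = 0; i < count; i++)
		usec += bytes_to_usec_trunc(bytes[i]);
	return usec;
}

/* Accounting in bytes: add first, convert once at the end. */
static int sum_bytes(const int *bytes, int count)
{
	int total = 0;

	for (int i = 0; i < count; i++)
		total += bytes[i];
	return total;
}
```

Under this assumed conversion, ten 10-byte transfers add up to 60 usec when converted one at a time, but the same 100 bytes convert to 66 usec when summed first.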

Finally, one last question.  The specs (and the code) mention in
several places that with proper scheduling, more than 16 full-speed
transactions will never occur within a single uframe.  That doesn't
seem right to me.

For example, why can't 17 two-byte isoc transfers occur within 125
usec?  As I see it, the bandwidth required per isoc transfer is:

	token packet (1-byte sync plus 3-byte packet)
	data packet (1-byte sync, 1-byte PID, 2 bytes of data,
		2-byte CRC)

which is 10 bytes (plus a fraction for the inter-packet gaps).  17 of
these would occupy less than 187.5 byte times if there's no
bit-stuffing.  And if the transfers were 0 bytes long then there would
be enough time for 22 of them!
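
The arithmetic can be checked with a short sketch. It works in bit times (12 Mbit/s for 125 usec is 1500 bits, i.e. 187.5 byte times) and deliberately ignores inter-packet gaps and bit-stuffing, so the results are upper bounds. The names are made up for this example.

```c
#include <assert.h>

/* Full-speed bit times in one uframe: 12 Mbit/s * 125 usec. */
#define FS_BITS_PER_UFRAME 1500

/* Bits on the wire for one isoc transaction, without gaps or
 * bit-stuffing: token packet plus data packet. */
static int isoc_bits_on_wire(int data_bytes)
{
	int token, data;

	assert(data_bytes >= 0);
	token = 1 + 3;			/* sync + 3-byte token packet */
	data = 1 + 1 + data_bytes + 2;	/* sync + PID + payload + CRC */
	return (token + data) * 8;
}

/* Upper bound on back-to-back isoc transactions in one uframe. */
static int max_isoc_per_uframe(int data_bytes)
{
	return FS_BITS_PER_UFRAME / isoc_bits_on_wire(data_bytes);
}
```

This gap-free model gives 18 for two-byte transfers and 23 for zero-byte transfers, consistent with the figures of 17 and 22 above once some allowance is made for inter-packet gaps.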

So what am I missing?

Alan Stern
