On Fri, Apr 10, 2009 at 11:10 AM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > On Fri, 10 Apr 2009, Kung James wrote: > > I don't understand the purpose of the entire region of code under > discussion: > > /* special case for isoc transfers larger than 125us: > * the first and each subsequent fully used uframe > * must be empty, so as to not illegally delay > * already scheduled transactions > */ > if (125 < usecs) { > int ufs = (usecs / 125) - 1; > int i; > for (i = uframe; i < (uframe + ufs) && i < 8; i++) > if (0 < tt_usecs[i]) { > ehci_vdbg(ehci, > "multi-uframe xfer can't fit " > "in frame %d uframe %d\n", > frame, i); > return 0; > } > } > > The comment's explanation isn't clear enough. What does "illegally > delay already scheduled transactions" mean? What's wrong with pushing > back existing transactions to later in the frame? And why do transfers > longer than 125 us need special treatment? Neither the USB 2.0 spec > nor the EHCI spec mentions anything about this. Maybe "illegally" isn't the most appropriate adjective, but ignoring the per-uframe scheduling requirements kind of defeats the purpose of scheduling. If we don't care about delaying transfers past the uframe where their SSPLIT is scheduled, why not just schedule everything to SSPLIT in uframe 0? The problem isn't on the low/full-speed side, it's perfectly ok on that side to delay a transfer within a frame. The problem is on the high-speed side, of managing the TT. Scheduling uses the best case timing, so all the "fudges" built in are intended to handle errors, not make up for improper scheduling. The "correct" way to schedule a TT, as I understand it, is to ensure that in the best case, for each transfer, the transfer will actually happen in the same uframe as the SSPLIT is issued (same in H-frame and B-frame uframe number, EHCI spec sec 4.5 fig 4.7). Knowingly scheduling a multi-uframe isoc transfer that will force the TT to delay issuing another transfer for one or more uframes is not correct scheduling. After the uframe where a SSPLIT is scheduled, there are only 3 uframes where the TT will buffer the transfer until it's dropped; there are no CSPLITs scheduled after that anyway so there is no point in the TT keeping it. The tt_available function should (with the -1 removed) only schedule a multi-uframe isoc transfer starting in an empty uframe, with each following fully used uframe also empty. The last uframe, that is not fully used (assuming the transfer isn't an exact multiple of 125us), is handled normally by the scheduling calculations in periodic_tt_usecs, as are all transfers less than 125usec (1 uframe). Note that as TT scheduling only has to be done for periodic transfers, not control/bulk transfers, we are only talking about isoc or interrupt here. The max packet size restrictions of low/full-speed interrupt limit it to less than 1 uframe so a multi-uframe interrupt transfer is impossible (with "multi-uframe" meaning it takes more than 125usecs). To break down tt_available and its helper functions: For each frame where the transfer is executed (start at the specified frame and also check each frame + period*N), it does: Call periodic_tt_usecs. This fills the tt_usecs array with the current bandwidth usage of each uframe. It checks the current schedule, adding up each transfer scheduled on the same TT for that frame. It's a simple calculation, just adding the transfer's tt_usecs time to the uframe where it is scheduled to run. At the end, carryover_tt_bandwidth is called, which "carries over" any excess bandwidth for any uframe. So if uframe 1 has 200usec of transfers scheduled to start in that uframe, 75usec of that is "carried over" into uframe 2, and so on. So, the maximum amount of time scheduled for any single uframe is as defined by max_tt_usecs (i.e., 125usec for the most uframes, except 6 and 7 as those must be reserved for control/bulk). Once the array is filled with the current schedule times, the uframe where the new transfer is asking to be scheduled is checked. If it is already 100% scheduled, simply return failure. If the requested uframe isn't full, and the transfer is an isoc transfer that is larger than 1 uframe (125usec), then the start uframe, and each subsequent fully-used uframe must be empty. If not, then whatever other transfers were scheduled in those uframes would not execute in the uframe they were scheduled to execute in, which will very likely cause them to fail. After making sure each fully-used uframe is empty, the last partially-used uframe can be scheduled using the normal carryover scheduling. If the requested uframe isn't full, and the transfer isn't an isoc transfer larger than 1 uframe, then simple carryover scheduling is used. The requested uframe bandwidth time is increased by the new transfer's tt_usecs time, and carryover_tt_bandwidth is called again. As long as the schedule isn't pushed into the reserved time (for control/bulk transfers), the transfer is allowed to schedule itself in the specified frame/uframe with the specified period. >> But currently the "tt_available" function doesn't check >> the last fully occupied microframe. > > I don't see why it has to check any uframe at all... :-) See above. I don't know why I added the -1 in there, as I did not have it in the first patch I submitted. I added it in the second and final patch I submitted which was accepted. Unless I'm missing something, it was a bug to add it in, as the integer division gives exactly the number of fully used uframes, no modification is needed. >> My test case is described as follow: >> >> First, submit a intr. IN urb to a low-speed device (mouse), and then submit >> many isoc. OUT urbs with packet size 192 bytes to a full-speed device >> (audio device). The EHCI will put the isoc. OUT SSPLIT transaction >> at microframe (0,1), and put the intr. IN SSPLIT transaction at >> microframe (0), and the intr. IN CSPLIT transaction at microframe (1,2,3). >> >> Because the isoc. OUT packet's size is 192 bytes, the EHCI can execute >> 188 bytes isoc. OUT SSPLIT at most in microframe 0, and the rest 4 bytes >> will be executed in the microframe 1. We can observe that the EHCI need >> to execute an 188 bytes isoc. OUT SSPLIT and an intr. IN SSPLIT in >> microframe 0. This may makes the HUB have not enough time to execute >> the intr. IN SSPLIT in microframe 0. (The isoc. OUT SSPLIT will be executed >> first, because of the EHCI periodic frame list structure) >> >> So the schedule result is wrong, the EHCI driver can not put the two split >> transaction at the same microframe. This mistake is due to the "tt_available" >> function doest not check the last fully used uframe for the isoc. OUT split >> transaction. (The intr. IN SSPLIT is already scheduled at microframe 0 by >> EHCI driver) > > You mean that the scheduler should move the interrupt transaction back > later in the frame when the isoc transfer is added, but it fails to > do so? (I.e., it doesn't rebalance the schedule as described in > 4.12.2.5 of the EHCI spec?) Yes, that would be a bug. No, that isn't the problem, at least not with the specific code in question now. Not that rebalancing is bad. But, there is no rebalancing of TTs in the ehci driver right now, as far as I know. I certainly didn't add any with the updated TT scheduling. > In fact, does ehci_hcd ever rebalance the periodic schedule? If it > doesn't then the proposed fix to tt_available() is clearly inadequate. Rebalancing isn't the issue; also rebalancing the high-speed schedule is different than rebalancing any particular TT's schedule. > For example, suppose that before the isoc transfer was added there > already were three interrupt transfers, of length 64, 64, and 60, > scheduled for uframe 1 (thus using up the entire uframe). Then there > still wouldn't be enough room to add the isoc transfer, but the patched > code wouldn't know there was a problem. > > Or take another example. Suppose we already have an isoc transfer of > length 154 and an interrupt transfer of length 32 scheduled in uframe > 0. Is there enough room to add a new isoc transfer of length 40? Not > if the interrupt transfer can't be pushed back to uframe 1 -- but > tt_available wouldn't realize this since none of the transfers lasts > more than 125 us. Sure it would. That's done in periodic_tt_usecs via the carryover scheduling. It doesn't rebalance the TT's schedule, it just makes sure that the SSPLIT will execute in the right uframe. Can the simple carryover math be improved? Sure. In fact it is possible to delay transfers past the uframe where they were supposed to execute; for example if uframe 0 is 90% full, and uframe 1 is 90% full with multiple transfers; if a new transfer asked to be scheduled in uframe 0 with say 100usec, it would be allowed since most of it would carry over into uframe 1, and most of uframe 1 would carry over into uframe 2. The last of the multiple transfers in uframe 1 could very well get pushed out if there is no time in uframe 1 to start it, but since there is a 3-uframe "fudge" buffer mandated per USB 2.0 spec, this should be ok. The large isoc check is supposed to prevent anything from being delayed more than 1 uframe, but the current bug changes it so it could possibly be delayed up to 2 uframes, leaving only 1 left for any not-best-case delays. The code could be improved to correctly schedule for all cases, probably by re-checking each transfer in the schedule after adding the time for the new transfer request, or maybe by keeping track of the time of the last SSPLIT in each uframe so it can verify that no SSPLIT is pushed into the next uframe. This would also remove the special large isoc case. But, with the -1 bug fixed, there is a max of 1 uframe that any transfer can be pushed back from its SSPLIT, and that is in rare cases I believe. -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html