Doug, Thanks for your detail debug information, pls add my Reviewed-by for this patch. Thanks, - Kever On 02/03/2016 06:47 AM, Doug Anderson wrote: > Kever, > > On Mon, Feb 1, 2016 at 11:46 PM, Kever Yang <kever.yang at rock-chips.com> wrote: >> Doug, >> >> >> On 01/29/2016 10:20 AM, Douglas Anderson wrote: >>> When setting up ISO and INT transfers dwc2 needs to specify whether the >>> transfer is for an even or an odd frame (or microframe if the controller >>> is running in high speed mode). >>> >>> The controller appears to use this as a simple way to figure out if a >>> transfer should happen right away (in the current microframe) or should >>> happen at the start of the next microframe. Said another way: >>> >>> - If you set "odd" and the current frame number is odd it appears that >>> the controller will try to transfer right away. Same thing if you set >>> "even" and the current frame number is even. >>> - If the oddness you set and the oddness of the frame number are >>> _different_, the transfer will be delayed until the frame number >>> changes. >>> >>> As I understand it, the above technique allows you to plan ahead of time >>> where possible by always working on the next frame. ...but it still >>> allows you to properly respond immediately to things that happened in >>> the previous frame. >>> >>> The old dwc2_hc_set_even_odd_frame() didn't really handle this concept. >>> It always looked at the frame number and setup the transfer to happen in >>> the next frame. In some cases that meant that certain transactions >>> would be transferred in the wrong frame. >>> >>> We'll try our best to set the even / odd to do the transfer in the >>> scheduled frame. If that fails then we'll do an ugly "schedule ASAP". >>> We'll also modify the scheduler code to handle this and not try to >>> schedule a second transfer for the same frame. >>> >>> Note that this change relies on the work to redo the microframe >>> scheduler. It can work atop ("usb: dwc2: host: Manage frame nums better >>> in scheduler") but it works even better after ("usb: dwc2: host: Totally >>> redo the microframe scheduler"). >>> >>> With this change my stressful USB test (USB webcam + USB audio + >>> keyboards) has less audio crackling than before. >> Seems this really help for your case? > Yes, I believe it does. Of course my test case is pretty "black box" > for the most part in that I play music on youtube while having a > webcam open and several USB input devices connected. I then try to > decide whether I hear more static or less static. ...clearly a less > subjective test would be better... > > * I tried with http://crosreview.com/325451 (see below) and I hear > more static with "use_old = true" than with "use_old = "false". > > * I tried with this entire patch reverted and I hear about the same > static as with "use_old = true". > > Note that counting reported MISS lines from my logging also shows that > the new code is better... > > >> Do you check if the transfer can happen right in the current frame? I know >> it's >> quite difficult to check it, but this changes what I know for the dwc core >> schedule the transaction. > Yes. I just tried again, too. I coded up > <https://chromium-review.googlesource.com/325451> and included it. I > then opened up a USB webcam. > > With things set to the old way: > > 115.355370 QH=dc6ba8c0 next(0) fn=10cb, sch=10ca=>10cb (+1) miss=0 > 115.355373 QH=dc6ba8c0 IMM ready fn=10cb, nxt=10cb > 115.355518 QH=dc6ba8c0 next(0) fn=10cc, sch=10cb=>10cc (+1) miss=0 > 115.355522 QH=dc6ba8c0 IMM ready fn=10cc, nxt=10cc > 115.355637 QH=dc6ba8c0 next(0) fn=10cd, sch=10cc=>10cd (+1) miss=0 > 115.355641 QH=dc6ba8c0 IMM ready fn=10cd, nxt=10cd > 115.355857 QH=dc6ba8c0 next(0) fn=10ce, sch=10cd=>10ce (+1) miss=0 > 115.355859 QH=dc6ba8c0 IMM ready fn=10ce, nxt=10ce > 115.355867 QH=dc6ba8c0, wire=10cf, old_wire=10d0, EO diff (use OLD) > 115.355870 QH=dc6ba8c0 EO MISS w/ old (10ce != 10cf) > 115.356037 QH=dc6ba8c0 next(0) fn=10d0, sch=10cf=>10d0 (+1) miss=1 MISS > 115.356039 QH=dc6ba8c0 IMM ready fn=10d0, nxt=10d0 > 115.356169 QH=dc6ba8c0 next(0) fn=10d1, sch=10d0=>10d1 (+1) miss=0 > 115.356170 QH=dc6ba8c0 IMM ready fn=10d1, nxt=10d1 > 115.356269 QH=dc6ba8c0 next(0) fn=10d2, sch=10d1=>10d2 (+1) miss=0 > 115.356273 QH=dc6ba8c0 IMM ready fn=10d2, nxt=10d2 > 115.356404 QH=dc6ba8c0 next(0) fn=10d3, sch=10d2=>10d3 (+1) miss=0 > 115.356407 QH=dc6ba8c0 IMM ready fn=10d3, nxt=10d3 > > With the new way: > > 87.814741 QH=e2fd7880 next(0) fn=32e4, sch=32e3=>32e4 (+1) miss=0 > 87.814744 QH=e2fd7880 IMM ready fn=32e4, nxt=32e4 > 87.814858 QH=e2fd7880 next(0) fn=32e5, sch=32e4=>32e5 (+1) miss=0 > 87.814862 QH=e2fd7880 IMM ready fn=32e5, nxt=32e5 > 87.815010 QH=e2fd7880 next(0) fn=32e6, sch=32e5=>32e6 (+1) miss=0 > 87.815012 QH=e2fd7880 IMM ready fn=32e6, nxt=32e6 > 87.815220 QH=e2fd7880 next(0) fn=32e8, sch=32e6=>32e7 (+1) miss=0 > 87.815222 QH=e2fd7880 IMM ready fn=32e8, nxt=32e7 > 87.815230 QH=e2fd7880, wire=32e8, old_wire=32e9, EO diff (use NEW) > 87.815278 QH=e2fd7880 next(0) fn=32e8, sch=32e7=>32e8 (+1) miss=0 > 87.815280 QH=e2fd7880 IMM ready fn=32e8, nxt=32e8 > 87.815390 QH=e2fd7880 next(0) fn=32e9, sch=32e8=>32e9 (+1) miss=0 > 87.815391 QH=e2fd7880 IMM ready fn=32e9, nxt=32e9 > 87.815491 QH=e2fd7880 next(0) fn=32ea, sch=32e9=>32ea (+1) miss=0 > 87.815493 QH=e2fd7880 IMM ready fn=32ea, nxt=32ea > 87.815635 QH=e2fd7880 next(0) fn=32eb, sch=32ea=>32eb (+1) miss=0 > 87.815638 QH=e2fd7880 IMM ready fn=32eb, nxt=32eb > > > Note that with my TEST-ONLY patch the old way is still _slightly_ > different in that I still communicate back to the scheduler with: > > chan->qh->next_active_frame = now_frame; > > The old code didn't used to do that. If I don't do that then you > you'll just stay in an inconsistent state for a while where things are > going on the wire 1 frame later than we think they are. > > > Also note that above you can see that the new way is indeed able to > schedule things in the current microframe. Looking one line at a > time: > > > 87.815012 QH=e2fd7880 IMM ready fn=32e6, nxt=32e6 > > QH e2fd7880 is going straight to the ready queue. Actual frame number > in hardware is 32e6. next_active_frame = 32e6 which means we ideally > want to give it to hardware in 32e6 and wire frame is 32e7. > > > 87.815220 QH=e2fd7880 next(0) fn=32e8, sch=32e6=>32e7 (+1) miss=0 > 87.815222 QH=e2fd7880 IMM ready fn=32e8, nxt=32e7 > > Frame number in hardware is now 32e8. We'd like to give the next > transfer to hardware in 32e7 to transfer on the wire at 32e8, but > that's obviously impossible. We will try to give it right away. > > > 87.815230 QH=e2fd7880, wire=32e8, old_wire=32e9, EO diff (use NEW) > > Showing a difference in the old way. We'll choose "even" to have the > packet go on the wire (expecting 32e8). > > > 87.815278 QH=e2fd7880 next(0) fn=32e8, sch=32e7=>32e8 (+1) miss=0 > 87.815280 QH=e2fd7880 IMM ready fn=32e8, nxt=32e8 > > We got a response back and are ready to schedule the next transfer and > it's still 32e8! That means that transfer must have happened (as > expected) in 32e8. Whew! Give the next transfer to hardware hoping > for 32e9 wire. > > > 87.815390 QH=e2fd7880 next(0) fn=32e9, sch=32e8=>32e9 (+1) miss=0 > > Now at hardware 32e9 and ready to schedule the next... > > > >> In dwc_otgbook, Interrupt OUT Transactions(also similar for Int IN, Iso >> IN/OUT) >> in DMA Mode, the normal Interrupt OUT operation says: >> The DWC_otg host attempts to send out the OUT token in the beginning of next >> odd frame/microframe. >> >> So I'm confuse about if the dwc core can do the transaction at the same >> frame >> of host channel initialized or not. > The docbook is obviously way too terse here, but the above experiment > shows that the hardware is designed in the only sane way that it could > be designed. > > Why do I say that this is the only sane way for the hardware to work? > I think all the following is true (please correct any errors): > > A) HW only lets you specify even/odd which means you choose between > two frame to send the packet. Two possible ways HW could be > implemented: "sane" way means you can send a packet in frame "x" and > "x + 1". "insane" way means you can send a packet in frame "x + 1" > and "x + 2" but not frame "x" > > B) In some cases (especially with regards to SPLIT transfers), we need > to use the result of a transfer in uFrame "x" to decide what to do > about uFrame "x + 1". Specifically for IN transfers I think we can't > know for sure whether we'll get back all of our data in uFrame "x" or > whether we'll only get part of the data and need uFrame "x + 1". > > C) It's possible to schedule 100us worth of periodic transfers in one > 125us uFrame. > > D) We can't know the result of a transfer until that transfer is done. > > > So above basically means that we might have a periodic transfer where > we get the result of the transfer 100us into a uFrame. We've now got > to quickly queue up the transfer for the next uFrame. If hardware was > designed in the "insane" way then we'd need an interrupt latency of < > 25 us since once the frame ticked we'd no longer be able to schedule. > If hardware was designed in the "sane" way then we'd "only" need an > interrupt latency of 125 us since we could continue to schedule even > partway through the current frame. > > Also note that if there's any chance that a periodic transfer ends > later than 100 us into a frame (like if a non-periodic transfer snuck > in there because we were out of periodic channels) then the above > problem becomes even more extreme. > > > > -Doug > > _______________________________________________ > Linux-rockchip mailing list > Linux-rockchip at lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-rockchip >