On Sun, Aug 25, 2013 at 11:39 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > On Sat, 24 Aug 2013, Ming Lei wrote: > >> >> >> For other HCDs, this changes on interrupt QH unlink may not need at all >> >> >> if there is no much cost about relink. >> >> > >> >> > But for some HCDs, it will be needed. >> >> >> >> Which HCDs need them? >> > >> > I haven't checked in detail. It seems likely that uhci-hcd and >> > ohci-hcd will need changes. >> >> Are you sure? If the change is absolutely required, there is certainly >> bug in them, since HCDs still need to support submitting interrupt URB >> in tasklet scheduled by its complete() handler? > > Yes, I am sure, and no, there is no bug. > >> >> Even for EHCI HCD, the interrupt QH unlink change isn't necessary, and >> >> the patch is only for improving the situation, isn't it? For other HCDs, >> >> basically unlink interrupt queue need't the wait time, so don't need the change. >> > >> > Wrong: For other HCDs, unlinks _do_ need to wait for the hardware. >> >> Just take a quick look at UHCI code, looks there is no much >> difference about unlinking qh between async qh and intr qh, which >> means the change might not need for uhci since usb-storage is >> common case to support. Correct me if it is wrong. > > The problem is that once an interrupt QH has been unlinked, relinking > it might not make it visible to the hardware until the next frame > starts. Therefore interrupt endpoints with an interval of 1 ms would > not work correctly; they would get unlinked and relinked in every > second frame. Suppose so, I ague it still need be fixed without the giveback URB in tasklet patches, since HCDs should support submitting intr URB not from completion directly(tasklet, workqueue, or simple task). So sort of the change on interrupt unlinking isn't only for supporting giveback URB in tasklet, but also a fix on current HCDs. Also, the 1ms interval delay problem might exist on your RFC approach, when the tasklet schedule delay is a bit long, the resubmitting is still very late than the complete time from view of hardware. > >> >> As I said, the only change introduced with running giveback in tasklet >> >> is that iso_stream_schedule() may see list_empty(&stream->td_list) >> >> when underrun, and we can handle it inside HCD and usbcore, nothing >> >> to do with drivers. >> > >> > That's not true, at least, not for ehci-hcd. >> >> Could you explain it in detail? I mean we only discuss the change on isoc >> schedule in case of underrun when giveback in tasklet. > > The code in iso_stream_schedule() knows that any URB scheduled to end > before ehci->last_iso_frame will already have been given back, and any Yes, the patch doesn't change the fact. > resubmissions by the completion handler will have occurred already. > That's what this comment means: > > /* > * Use ehci->last_iso_frame as the base. There can't be any > * TDs scheduled for earlier than that. > */ > > But with your tasklet, this isn't true any more. Even though the It is still true, with the giveback urb in tasklet patch, ehci->last_iso_frame is updated to 'now' in scan_isoc(), then URBs scheduled in tasklet is always after 'now' in scan_isoc(). > driver has called usb_hcd_giveback_urb, the completion handler may not > have run yet. Also even without my patches, HCD should support isoc URBs submitted by tasklet or workqueue which is scheduled in its complete(), right? If you think there is problem caused by the patchset, the problem should have been existed in HCDs already, then it is unfair to count the change on my patch, :-) > >> >> OK, I give you a simple solution in which only one line change is >> >> needed for these HCDs, see attachment patch. >> > >> > The patch is wrong. It keeps track of whether the URB is being >> > resubmitted by the completion handler, not whether the endpoint queue >> >> list_empty(&stream->td_list) is already checked in iso_stream_schedule(), >> so I am wondering what is the 'wrong' you mean. > > Hmmm. For one thing, the patch doesn't include a definition of > hcd_is_urb_completing(). For another, it won't work if the driver Sorry for missing it. > keeps a pool of URBs and shares them among multiple endpoints. It is still easy to extend my patch to support the case, just store the 'urb->ep' into percpu variable and compare it with resubmited urb->ep because urb->ep has to be correct before completing and during submitting. See the attachment patch. > >> We need to know if the empty endpoint queue is caused by the last >> completed URB in queue, which will be resubmitted later. >> >> > has emptied out. What happens if the completion handler for URB0 >> > submits URB1? >> >> Do you have such example? > > I posted a proposal for exactly this sort of thing a few weeks ago. > The idea was that the URBs have to vary in size, depending on what the > device is doing, so you can't predict in advance how much data a single > URB will contain. But the driver wants to keep about the same amount > of data on the output queue at all times. Therefore the completion > handler will sometimes not submit any URBs, and sometimes it will > submit more than one. OK, got it, and my attachment patch still supports your case. > >> It is possible for the two URBs which belong to >> different endpoint and let one of URBs to start the stream, but I am wondering >> it is reasonable that the two belong to same endpoint? Even for this case, >> we can handle it easily by checking if the two URBs belong to same >> endpoint. > > There's a much simpler way to solve this problem. I will post an RFC > later on. Actually, my attachment patch is only about dozens of lines(17+, 1-), so sounds the problem isn't difficult to solve finally because of your simpler one, :-) > >> > To do it properly, you would need to count the number of URBs on the >> > endpoint queue. The count would be decremented after the completion >> > handler returned, which means it probably would have to be an atomic_t >> > variable -- and I know that you hate those! >> >> percpu variable should be ok since no preemption is possible during >> completing? and no irq handler may operate the variable. > > Percpu variables are not always the best answer to everything! :-) I agree, but depends on situation, :-) > >> > Besides, even if your patch was correct, it still wouldn't be >> > sufficient. ehci-hcd relies on the fact that the entries in a >> > non-empty isochronous queue expire on or after ehci->last_iso_frame. >> > With your patch, that wouldn't be true any more. >> >> Sorry, I don't understand your point, as I said, the only change on ISOC >> schedule introduced with running giveback in tasklet is that >> iso_stream_schedule() may see list_empty(&stream->td_list) when >> underrun, and the above discussed patch should address this one. >> >> If you mean the delay introduced by tasklet, I don't think there is no >> irq handling delay. >> >> If you mean other things, could you explain in a bit detail please? > > Suppose an URB is submitted while the endpoint queue is non-empty, so > it is scheduled to start in the next slot after the end of the last URB > in the queue. Does that next slot occur after the frame stored in > ehci->last_iso_frame? I think it is yes, actually the new slot is always after current uframe during iso_stream_schedule from tasklet, and it is behind ehci->last_iso_frame which is updated in ehci_irq path. > > With the old driver it always does, because the fact that the last URB > hasn't been given back yet means that the last URB ends after > last_iso_frame. Therefore the new URB will begin after last_iso_frame. If new slot of new URB is always after current uframe, there isn't the problem. > > With the new driver it doesn't, because the endpoint queue will remain > non-empty for some time after ehci-hcd calls giveback_urb for the last > URB. As a result, the TDs for the new URB might not get scanned, > because the scanning starts from the last_iso_frame position. Looks it is impossible, see above. > When this happens, ehci-hcd has to take special care not to link those > TDs. I have written a patch to do this, and I will post it along with > the RFC mentioned above. > >> >> ehci->async_unlinking >> >> ehci->intr_unlinking >> >> qh->exception >> >> QH_STATE_COMPLETING >> > >> > Each of those uses about three lines of code. And you're wrong about >> > qh->exception; that one is needed (or will be needed) for other >> > reasons. >> >> No, with my patches, about 50 line codes can be saved. Also it may be not >> related code decrease only, it is about race-killing, and you know people >> always hate races, :-) > > The races have been taken care of for years. I prefer not to remove > the code that handles those races, because there's a good chance it > will be needed in the future. (For example, if someone decides to move > ehci_work into the bottom half, as in my previous RFC.) OK, I will hold them until we can get kind of conclusion about comparison on the two approaches. > >> > It turns out that what I did in the RFC wasn't a valid solution, >> > though. It will be necessary to make sure that givebacks never happen >> > in an unlink call or timer handler; the tasklet will have to take care >> > of them instead. That will complicate the patch, unfortunately. >> >> Yes, imagine other HCDs still need to deal with similar & complicated >> things, so why don't we deal with most of things in usb core like only >> giveback in tasklet patches? > > I wish we could, but we can't! All these other changes needed in > ehci-hcd are proof of this -- they are things that _can't_ be dealt > with in usbcore. > >> For ehci HCD example, ehci->need_rescan is set and returns when >> HCD is scanning, so the problem is avoided, but it is implemented in >> HCD code, and other HCDs may need to solve the same problem >> with similar trick too.(UHCI uses similar trick) > > Of course they do. If they haven't solved this problem already then > they are buggy, and have been buggy for a long time. > >> I mean we can do it in usb core like giveback in tasklet patches, >> so HCDs won't worry about the problem any more, isn't it a >> simplification for HCDs? > > Yes, it is. But at the cost of making other things more complicated -- > such as the way interrupt and isochronous URBs are handled in ehci-hcd. As I said, these other things are not only for giveback URB in tasklet patches, but also fixes on current HCDs, which means they should have been done no matter my patches are posted or not, :-) Thanks, -- Ming Lei
Attachment:
tracing-compleing-urb.patch
Description: Binary data