On Mon, Jul 22, 2013 at 11:09 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > On Mon, 22 Jul 2013, Ming Lei wrote: > >> > Not exactly. The HCD has to keep track of the next available time slot >> > for the endpoint. Some special value, like -1, will indicate that no >> > stream is running yet. So the new function will merely have to set the >> > slot value to -1. >> >> I am wondering if -1 is special enough, since in theory -1 might >> be a valid time slot when HCD uses time slot ANDed with (mod - 1). > > On EHCI, valid time slots run from 0 to 8191 (or less). On OHCI they > run from 0 to 65535. On UHCI they run from 0 to 1023. There are also lots of non-standard HCs, and going to be many in future. When one new API is introduced, maybe we can't ignore these HCs, also need consider coming HCs. > >> >> If we define the API as usb_iso_stream(udev, ep, start), hcd only >> >> need to read the flag, so it should be simper to use and implement. >> > >> > It wouldn't be simpler for drivers. And what happens if the HCD gets >> > an URB submitted for an endpoint where the stream_flag is off? >> >> If the API is defined as usb_iso_stream(udev, ep, start), drivers are required >> to call usb_iso_stream(udev, ep, 1) before starting stream, and call >> usb_iso_stream(udev, ep, 0) after stopping stream. > > This just adds complexity with no benefit. The symmetrical(start, stop) APIs may be easy to understand and use, and looks no extra complexity introduced. One benefit is that we may remove the assumption of -1 being invalid frame/uframe on all HCs, another one is that HCD can know clearly if stream is on or off currently, which might be helpful in the future. > >> >> Also with both streaming on and off information, the HCD might improve >> >> bandwidth management in the future. >> > >> > I doubt it very much. The USB spec states that bandwidth for periodic >> > endpoints gets allocated whenever a new altsetting is installed, and >> > that is the approach the HCDs will try to take. >> >> Looks USB spec doesn't mention part of the bandwidth allocated by >> set_interface can't be freed. > > It is implicit. If bandwidth gets allocated then drivers can assume it > is available. If it were then freed by the HCD, it might not be > available when a driver wanted to use it. > >> Also Clemens mentioned, there might be devices which exposes non-zero >> payload sizes in the default interface setting. > > No problem. The bandwidth will be allocated when the default interface > setting is installed. > >> >> I mean the HCD private lock, and there are heavy contention >> >> on the lock. >> > >> > I don't have any ideas how to reduce that contention. Do you? >> >> For ehci example, one draft idea I thought about is to split the ehci->lock >> into three locks: ehci->periodic_lock which covers intr & isoc transfer, and >> ehci->async_lock which covers async schedule, and ehci->lock which >> manages some global thing but is held with short time. >> >> But the idea is far more mature, and might be wrong, and only for discussion >> because I haven't considered much details already. > > Okay. > >> >> Also I thought of one improvement on the idea of 'urb->completing': >> >> >> >> - define one percpu 'struct urb *' variable inside 'struct hcd' to record >> >> the completing urb, so HCD can see easily if the urb is being resubmitted >> >> inside its completion handler. >> > >> > Even on a UP system, what happens if multiple URBs get completed before >> > the tasklet runs and one of them is resubmitted? >> >> On each CPU, for one HCD, one urb->complete() won't be preempted >> by any other urb->complete() either running in tasklet or hardirq handler, >> so we can record the completing urb in one HCD percpu variable before >> calling urb->complete() and clear the variable after returning form >> urb->complete(). Then in submit path, we can find if the URB is scheduled >> in its own completion handler. > > That might work. (What if the thread is preempted while the completion > handler is running and then continues on a different CPU?) But I think It works under the situation because both interrupt handler and tasklet can't be preempted. If the thread of urb submission is preempted, that means it isn't run in completion handler, so no matter if it is migrated into other CPUs, the approach can work well. > it would still have higher overhead than calling usb_new_iso_stream > once, whenever a new stream is started. Yes, the approach introduces overhead of two stores to percpu variable in complete() path. But it can't replace the API because it can't work out in case that urb is resubmitted in tasklet or workqueue whch is scheduled in completion handler. The approach is only helpful if most of isoc drivers need to use the new API. Considered most of drivers resubmit isoc URBs inside completion handler, we can use this automatic way to decide new stream for these drivers and avoid lots of changes. Thanks, -- Ming Lei -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html