On Wed, Jun 15, 2011 at 04:39:22PM +0800, Xu, Andiry wrote: > > -----Original Message----- > > From: Sarah Sharp [mailto:sarah.a.sharp@xxxxxxxxxxxxxxx] > > Sent: Wednesday, June 15, 2011 11:32 AM > > To: Xu, Andiry > > Cc: linux-usb@xxxxxxxxxxxxxxx; matthew.r.wilcox@xxxxxxxxx > > Subject: Re: [PATCH 5/5] xHCI: assign devices to different > interrupters > > > > On Fri, Jun 10, 2011 at 04:51:43PM +0800, Andiry Xu wrote: > > > This patch assign devices to different interrupters based on its > > slot_id. > > > However, it's not mandatory that a device should use the same > > interrupter. > > > A device can assign its control transfer to one interrupter and > > bulk/intr/ > > > isoc transfer to another interrupter. The interrupter usage policy > is > > > flexible. > > > > > > The policy in this patch may be replaced by better approach in the > > future. > > > > I would like to get a more flexible policy, or at least the APIs to > > allow drivers or the USB core flexibility in directing URB traffic to > > different interrupters before this gets merged. I'm most concerned > > about allowing drivers or the USB core to change interrupt moderation > > rate the rings should use. Either that, or the xHCI driver needs to > > choose the moderation interval based on the endpoint type and polling > > rate. I think we need that, because I have doubts that these patches > > alone will improve performance. I don't really want to add this much > > optional code without some indication that it will help performance. > > > > Agreed. But does the IMODI field can be modified during running? The > spec > only says the IMODC field may be directly written by software at any > time > to alter the interrupt rate. I think you can only modify the IMODI field when the host controller is first set up, but I'll check with the xHCI spec architect. I suspect we could probably get away with modifying it, since hardware is supposed to load a counter with the value after every interrupt, but I'd rather not rely on hardware-specific behavior if the spec doesn't say it can be modified during run time. > > What about adding a field to the struct urb to allow the driver > > to set the interrupter field? We might have to add some USB core API > > to > > allow the driver to figure out how many interrupters are available. > If > > the driver doesn't set the field, we can just use the per-device > > basis. > > > > Adding a intr_target field to the URB? Then to set the field, usb core > not only need to know the number of interrupters enabled, but also needs > to find out the most suitable interrupter. Does every urb need to > calculate > the field before being submitted to the host driver? If so, I think it's > another performance lost. I think drivers probably know when they switch configurations or alternate interface settings what latency requirements they need. A webcam driver needs a latency less than the periodic rate of the isochronous endpoint. So we could have one call to allow the driver to set the interrupter once before it submits any URBs. The USB core could store that value in a per-driver structure, or the xHCI driver could store it in xhci_virt_ep. The xHCI driver could set up several different endpoint rings for periodic endpoints, and advertise their latencies (IMODI values) to the USB core. The USB core, based on the latency the driver needs, could set the driver's interrupter target once, or maybe once per alternate interface setting change. So that API design might avoid any per-URB performance hit. > > I think we'd have to restrict the driver to only using one interrupter > > per endpoint, unless streams are enabled. Otherwise I think it breaks > > the xhci_td list handling assumption that events are handed back in > the > > order they are placed on the ring, because we could have a race > > condition between two CPUs that receive interrupts for the same > > endpoint. You could set an initial interrupter used in the > > xhci_virt_ep > > structure, and then double check it whenever a new URB is queued for > > that endpoint. > > > > I think different kinds of eps can be assigned to different > interrupters. > For example, bulk ep can use an interrupter with lower interrupt rate. > Maybe xHCI driver can simply initialize interrupters for different kinds > of eps, and assign urbs to them. If the drivers don't use this new API, the xHCI driver can try to make an informed choice (once per endpoint during the configure endpoint command) to try and decide which interrupter would be best. At the very least, we should probably have separate event rings for different device speeds, and maybe two dedicated event rings for HS and SS isochronous traffic? We probably want a separate event ring just for control transfers, with the IMODI field set to zero. There's no sense in making the hardware delay control transfers. Of course, control transfers are rarely (perhaps never?) performance critical. Bulk endpoints are a separate issue. I think we always want to set the IMODI value to zero for those, because the drivers really want to get notified immediately when the transfers are done. Especially the BoT driver, that waits on each bulk transfer before submitting the rest. So we can probably direct control and bulk endpoints to the same event ring with IMODI set to zero. > > Then if each URB could have a different interrupter, the UAS driver > > could > > direct different stream transfers to different interrupters. > > > > Does it require to modify UAS driver? I'm not familiar with it. I don't think it necessarily has to. The xHCI driver could allocate many event rings dedicated to bulk streaming endpoints, and then just use the stream ID number to direct it to different rings. Maybe we should allocate one event ring for each of the number of bulk streams the host controller says it can support? Or twice the number, if we think we'll have multiple UAS devices attached to the system? But we could add an interrupter target to the URB (that the USB core doesn't touch), and allow the UAS driver to set the policy if it wishes in the future. It could want to bind a particular stream ID to a particular core, to make sure the data structures are in-cache. But then we'd need a way for the UAS driver to discover which interrupter is bound to which MSI-X vector and which core... I'm not sure if it's feasible. > BTW, I found that current xHCI driver initialize the interrupter's IMODI > Field with value 160. The default value is 4000. With the algorithm in > 5.5.2.2, we know it means 1000 interrupts per second. Yeah, that's not a very default good value. I asked Steve about it, and he said he asked Microsoft what was the default value for their interrupter on EHCI, and they said 1ms, and then later said, "Oh, but we don't really use that. We interrupt as often as possible." > However, 160 means 25000 interrupts per second. xHCI spec says, the > maximum observable interrupt rate from the xHC should never exceed > 8000 interrupts/sec. Oh, interesting, I hadn't caught that bit. ISTR that setting the IMODI value to zero instead of 160 actually did help performance on the USB 3.0 storage device I was testing with an NEC host. I'll have to rerun those tests some time. You might want to do some experiments on your own hardware. > Especially, 4.17.2 says, an initial suggested range for the moderation > Interval is 651-5580 (28Bh - 15CCh). > > So I wonder why the value 160 is chose, is there any special reason? An IMODI value of 4000 was giving me *horrible* performance, so I asked the USB-IF certification folks what they used, and just used that value. Sarah Sharp -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html