From: Sarah Sharp > Hi David, > > I've been thinking about this some more, and I'd like to propose a much > simpler solution. > > The TD fragment rules didn't go into the xHCI specification until the > 1.0 revision. The code that follows those rules only seems to trigger > issues for 0.96 ASMedia hosts, while Intel 1.0 hosts benefit from the > code when USB ethernet devices are attached. The patch that changed the > link TRB behavior (commit 6c12db90f19727c76990e7f4801c67a148b30111) was > submitted in 2010, and the xHCI 1.0 spec didn't come out until later > that year, so it's unlikely that Synopsys host is a 1.0 host either. > > So why not just limit your code (and the sg table entry limit) to only > trigger for 1.0 hosts? That fix would be much less complex than this > one. I would still consider the other cleanup patches on to of it for > inclusion in 3.15, although it looks like patches 4 and 5 rely on this > patch. My suspicion is that it is possible to lock up 0.96 ASMedia hosts (or other hosts that need the link TRB behaviour change) without my patch that adds NOPs to the ring. It will happen much less often than when the NOPs are added, but it can still happen. If a TD starts with a LINK TRB then the LINK TRB is the last one that is owned by the controller until the setup completes. If the controller finishes the previous transfer while the while the setup is still in progress then it will process the LINK TRB and we are in exactly the same situation as before patch 6c12db90f1. This actually means that patch 6c12db90f1 didn't completely fix the problem. With the NOPs, the driver writes the first NOP at about the same time as the previous version would have written the LINK. The rest of the NOPs and the LINK are then probably written faster than the controller reads them. So I surmise that if the patch wrote a LINK instead of the first NOP then the controller would hang just as often. But this is the same as the case when no NOPs are written because the previous TD finished in the last slot. I know this is all failing with disk transfers which I believe are all single threaded. So maybe the LINK TRB that the controller is reading is one written in response to a previous transfer completing. This is a timing race which could be affected by small things (like putting the controller into a different computer). But I can't see how it is affected by the number of NOPs (including zero). A patch that only adds the NOPs for transfers that are expected to be misaligned (or are not known to be aligned - depending on which way you want to flag things) is ok by me. That would allow unlimited length aligned sg transfers on v1.00 hosts. David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html