On Tue, Apr 13, 2010 at 08:37:16PM +0530, Ramya Desai wrote:
> On Tue, Apr 13, 2010 at 1:48 AM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> >> I never saw the mass storage driver enqueue more sg entries than
> >> TRBS_PER_SEGMENT. It's just Ramya's driver that has this behavior. At
> >> this point, we can't tell what he's doing without his source code.
> >
> > This directly contradicts Ramya's statement above: "When I am testing
> > the default usb storage driver with 960 max_sectors Sarah saw ~120
> > scatter-gather list entries."
> >
> > Evidently the two of you need to figure out exactly what happened.
> > More tests can't hurt...
>
> Sorry to confuse you but please find the scenario that resulted in the
> above discussion. Initially, I was working on a UASP driver and was
> seeing a timeout issue. So, during my debug process, I also tested the
> default mass storage driver by increasing the max sectors to 960,
> since our device is capable of supporting it. Here, I encountered a
> similar issue and sent my logs to Sarah, requesting her help. At that
> point, the logs had indicated that scatter gather list had 73 entries
> (>63).
>
> Later, I used the Sarah Sharp “xhci-large-tx” branch where
> TRBS_PER_SEGMENT is set to 128, which was thought about as a temporary
> fix. That is the reason I saw 120 scatter-gather list entries.
> However, today I tested the default usb storage driver for more than
> 10 times by setting the TRBS_PER_SEGMENT to 64 and I am unable to see
> sg_list entries greater than 63. When I changed TRBS_PER_SEGMENT to
> 128 then I am seeing 120 sg_list entries.

Ok, I think I understand what happened. The xhci-streams branch you
were testing against was based against 2.6.32 (the original one that
came out, not any of the stable versions that followed).
That kernel didn't have David Vrabel's patch to make the URB scatter
gather support more generic (commit
4c1bd3d7a7d114dabd58f62f386ac4bfd268be1f), and thus the USB mass storage
driver didn't check bus->sg_tablesize. When you modified max_sectors,
the mass storage driver would blindly enqueue a scatter gather list with
more entries than the xHCI driver could handle. We never saw this issue
before because max_sectors was always low. You didn't see this issue on
the xhci-large-tx branch because it's based on 2.6.34-rc2.

I don't think we need to patch the stable kernel series, because it
won't have a driver that triggers this bug, and we won't allow new
drivers to go into the stable kernels.

For you, I suggest you base all your UASP work against the latest
kernel. I'm going to delete the old xhci-streams branch (based against
2.6.32) and rename xhci-streams-rebase (based against 2.6.34-rc2) to
xhci-streams. By goodness, I want the streams code merged already!

> But, that still leaves my issue in the unresolved state, since I am
> unable to do large transfers consistently (with default mass storage
> driver) when max_sectors is set to 960. I tried the same with EHCI
> driver (using USB2.0 port) and it works fine without any issue. I also
> tried with USB 2.0 cable by connecting it to USB 3.0 port and it works
> fine without any issues. So, I guess it should either be a hardware
> issue or maybe something related to xHCI driver. I can send the latest
> logs, if needed.

Please don't send logs just yet.

Are you running the mass storage driver on your own hardware, or someone
else's USB3 storage device? If you're using your own hardware, are you
able to transfer large sglists when you try a different device?

By "do large transfers consistently", do you mean that the host
controller "dies", meaning it disconnects from the PCI bus? I've looked
at your logs for when that happens, and I can't discern any patterns.
All the information the software is sending to the xHCI host controller
looks correct. The hardware is just flaky. I can consistently transfer
large scatter gather lists with max_sectors set to 960 with my setup
with the Ratoc Express Card xHCI host and the SIIG 2.5" drive enclosure,
so it may be an issue with your host controller or your device. Please
rule out your device by testing with a different USB3 mass storage
device with the same kernel and host.

If your host controller is at fault, it may be necessary for the xHCI
driver to limit the number of sglist entries so your buggy host
controller doesn't crash. Can you experiment with changing this line in
xhci-pci.c:

    hcd->self.sg_tablesize = TRBS_PER_SEGMENT - 1;

Try leaving TRBS_PER_SEGMENT set to 64, and then modifying this line to
limit the sglist to something small, like 10 or 25. Run the default USB
mass storage driver and see if this resolves the HC died issue for you.
If it does, try to find the largest value that doesn't make the host
die.

Sarah Sharp
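
As a rough check of the numbers discussed in this thread, here is a
minimal C sketch. It is not kernel code, and the helper names
(worst_case_sg_entries, sg_limit, fits) are made up for illustration.
It assumes 512-byte sectors, 4096-byte pages, a worst case of one page
per scatter-gather entry, and one TRB per entry; real transfers may
coalesce adjacent pages or need extra TRBs.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helpers for illustration only; none of these names
 * exist in the kernel.
 */

/*
 * Worst-case number of sg entries for a transfer of max_sectors,
 * assuming every page ends up as its own entry (no coalescing).
 */
static inline unsigned int worst_case_sg_entries(unsigned int max_sectors,
                                                 unsigned int sector_size,
                                                 unsigned int page_size)
{
    assert(sector_size > 0 && page_size > 0);

    unsigned long long bytes =
        (unsigned long long)max_sectors * sector_size;

    /* Round up to whole pages. */
    return (unsigned int)((bytes + page_size - 1) / page_size);
}

/* Mirrors "hcd->self.sg_tablesize = TRBS_PER_SEGMENT - 1". */
static inline unsigned int sg_limit(unsigned int trbs_per_segment)
{
    assert(trbs_per_segment > 0);
    return trbs_per_segment - 1;
}

/* True if a list of sg_entries stays within the advertised limit. */
static inline bool fits(unsigned int sg_entries,
                        unsigned int trbs_per_segment)
{
    return sg_entries <= sg_limit(trbs_per_segment);
}
```

Under those assumptions, max_sectors = 960 works out to 120 entries,
which exceeds the limit of 63 for TRBS_PER_SEGMENT = 64 but fits under
the limit of 127 for TRBS_PER_SEGMENT = 128. That is consistent with
the 120-entry lists seen on the xhci-large-tx branch and with the
73-entry list overflowing on the older branch.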