On 09/28/2013 09:57 PM, Alan Stern wrote:
On Sun, 29 Sep 2013, Arokux X wrote:
What happens if you back-port your glue driver to the vendor's kernel?
I now have 200 lines of code which are (almost) identical. They work
in the vendor's kernel and fail in mainline.
This indicates that the problem isn't in your glue driver, but is
somewhere else in the kernel.
This isn't surprising. The errors you are getting are hardware errors,
not protocol errors. They could be caused by excessive noise in the
USB data lines. Or there could be some sort of timing issue.
I've noticed there is an ehci-timer.c now. It wasn't present in 3.4.
The main clock of the SoC I'm working on is running at 24 MHz.
There are no hstimers implemented. Do you think that could be the problem?
I tend to doubt it. In the traces you posted, almost every transfer
worked. Only a few of them got errors. If timers were a problem then
none of the transfers would have worked.
I have tested several other USB devices: a keyboard and an Ethernet
adapter. They worked. I'm not sure whether these tests are "clean",
since I connected them to the USB ports which are behind the on-board
4-port USB hub. The hub is connected to the first USB host controller.
The wifi module, however, is connected directly to the second USB host
controller.
Do you have any ideas how I can troubleshoot this issue further? Is
there any chance of a regression in the EHCI stack?
There's always a chance of a regression.
Can you try connecting the WiFi module to a regular PC? Maybe that
will provide some new ideas.
In your traces, the working case had about 200 us between each URB
completion and the following submission, whereas the non-working case
had much less time -- around 20 us. Maybe in the new kernel, the bulk
transfers occur too rapidly for the WiFi module to handle.
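
The turnaround comparison above can be reproduced from any trace that records
submission and completion timestamps. What follows is only a minimal sketch:
the urb_event record and mean_turnaround_us() are hypothetical names made up
for illustration, not part of any kernel or usbmon API, and the trace is
assumed to be already parsed into (event type, microsecond timestamp) pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical trace record: 'S' = URB submission, 'C' = URB completion. */
struct urb_event {
	char type;
	uint64_t t_us;	/* timestamp in microseconds */
};

/*
 * Mean gap, in microseconds, between each completion and the first
 * submission that follows it. Returns 0 when no such pair exists.
 */
uint64_t mean_turnaround_us(const struct urb_event *ev, size_t n)
{
	uint64_t total = 0;
	uint64_t last_completion = 0;
	size_t pairs = 0;
	int have_completion = 0;

	assert(ev != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (ev[i].type == 'C') {
			/* Remember the most recent completion time. */
			last_completion = ev[i].t_us;
			have_completion = 1;
		} else if (ev[i].type == 'S' && have_completion) {
			/* First submission after a completion closes the pair. */
			total += ev[i].t_us - last_completion;
			pairs++;
			have_completion = 0;
		}
	}
	return pairs ? total / pairs : 0;
}
```

Running this over the working and the failing trace should make the roughly
tenfold difference in turnaround visible as a single number per trace.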
Or maybe the problem lies in the EHCI hardware. Have you checked for
errata?
On PCs, the driver has some problems with stability of the radio connections,
and I am currently working on that problem; however, there are no difficulties
in communicating over the USB system.
One problem that was recently discovered on the ARM architecture is that the
private area at the end of the main structure was not aligned. The fix has been
added to the wireless-testing tree as commit 60ce314d1750fef. It is on its way
to mainline, but is not there yet. Fortunately, it is a one-liner as follows:
diff --git a/drivers/net/wireless/rtlwifi/wifi.h
b/drivers/net/wireless/rtlwifi/wifi.h
index cc03e7c..7032587 100644
--- a/drivers/net/wireless/rtlwifi/wifi.h
+++ b/drivers/net/wireless/rtlwifi/wifi.h
@@ -2057,7 +2057,7 @@ struct rtl_priv {
that it points to the data allocated
beyond this structure like:
rtl_pci_priv or rtl_usb_priv */
- u8 priv[0];
+ u8 priv[0] __aligned(sizeof(void *));
};
#define rtl_priv(hw) (((struct rtl_priv *)(hw)->priv))
The patch is line-wrapped, and probably has its white space mangled by this
route of transmission, but it is easily applied manually.
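
To see what the alignment attribute changes, here is a small user-space sketch.
The demo_plain and demo_aligned structures are hypothetical stand-ins, not the
real struct rtl_priv layout, and the kernel's __aligned() macro is spelled out
as the GCC/Clang attribute it is assumed to expand to. The zero-length array
is a GNU extension, so this needs gcc or clang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: a one-byte member leaves the tail misaligned. */
struct demo_plain {
	void *owner;
	unsigned char flag;
	unsigned char priv[0];	/* byte-aligned, starts right after flag */
};

/* Same layout, but the tail is forced to pointer alignment. */
struct demo_aligned {
	void *owner;
	unsigned char flag;
	unsigned char priv[0] __attribute__((aligned(sizeof(void *))));
};

size_t plain_priv_offset(void)
{
	return offsetof(struct demo_plain, priv);
}

size_t aligned_priv_offset(void)
{
	return offsetof(struct demo_aligned, priv);
}

int is_pointer_aligned(size_t offset)
{
	return offset % sizeof(void *) == 0;
}

/* Sanity check: only the attributed tail is safe to hold pointers. */
void check_layout(void)
{
	assert(!is_pointer_aligned(plain_priv_offset()));
	assert(is_pointer_aligned(aligned_priv_offset()));
}
```

A structure containing pointers that is cast onto the unaligned tail would be
accessed at an odd address, which some ARM cores do not tolerate; with the
attribute, the compiler pads the tail up to the next pointer-sized boundary.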
My suspicion is that this fix is not your problem, but I know of no other driver
problems that might affect your SoC.
Larry