Hi
On 10-10-2022 at 13:04, Ferry Toth wrote:
Hi
On 10-10-2022 07:02, Andrey Smirnov wrote:
On Fri, Oct 7, 2022 at 6:07 AM Ferry Toth <fntoth@xxxxxxxxx> wrote:
On 07-10-2022 04:11, Thinh Nguyen wrote:
On Thu, Oct 06, 2022, Ferry Toth wrote:
Hi
On 06-10-2022 04:12, Thinh Nguyen wrote:
On Wed, Oct 05, 2022, Ferry Toth wrote:
Hi,
Thanks!
Does the failure only happen the first time host is initialized? Or can
it recover after switching to device then back to host mode?
I can switch back and forth and device mode works each time, host mode
remains dead.
Ok.
Probably the failure happens if some step(s) in dwc3_core_init() hasn't
completed.
tusb1210 is a phy driver right? The issue is probably because we didn't
initialize the phy yet. So, I suspect placing dwc3_get_extcon() after
initializing the phy will probably solve the dependency problem.
You can try something for yourself or I can provide something to test
later if you don't mind (maybe next week if it's ok).
Yes, the code move I mentioned above "moves dwc3_get_extcon() until after
dwc3_core_init() but just before dwc3_core_init_mode(). AFAIU initially
dwc3_get_extcon() was called from within dwc3_core_init_mode() but only for
case USB_DR_MODE_OTG. So with this change order of events is more or less
unchanged" solves the issue.
I saw the experiment you did from the link you provided. We want to also
confirm exactly which step in dwc3_core_init() was needed.
Ok. I first tried the code move suggested by Andrey (didn't work). Then,
after reading the actual code, I moved it a bit further.
This move was on top of -rc6 without any reverts. I did not make additional
changes to dwc3_core_init().
So current v6.0 has: dwc3_get_extcon - dwc3_get_dr_mode - .. - dwc3_core_init - .. - dwc3_core_init_mode (not working)
I changed to: dwc3_get_dr_mode - dwc3_get_extcon - .. - dwc3_core_init - .. - dwc3_core_init_mode (no change)
Then to: dwc3_get_dr_mode - .. - dwc3_core_init - .. - dwc3_get_extcon - dwc3_core_init_mode (works)
The ".." entries are calls that I believe are irrelevant to this issue:
dwc3_alloc_scratch_buffers, dwc3_check_params and dwc3_debugfs_init.
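The three call orders reported above can be written down as data to make the common factor explicit. This is only a sketch that restates the observations: the labels ("v6.0", "first move", "second move") and the helper name are made up for illustration, and it does not identify which step inside dwc3_core_init() actually matters.

```python
# Call orders exactly as reported above; the ".." steps are left out.
# Each entry is (call order, does host mode work?).
ORDERINGS = {
    "v6.0": (
        ["dwc3_get_extcon", "dwc3_get_dr_mode",
         "dwc3_core_init", "dwc3_core_init_mode"],
        False,
    ),
    "first move": (
        ["dwc3_get_dr_mode", "dwc3_get_extcon",
         "dwc3_core_init", "dwc3_core_init_mode"],
        False,
    ),
    "second move": (
        ["dwc3_get_dr_mode", "dwc3_core_init",
         "dwc3_get_extcon", "dwc3_core_init_mode"],
        True,
    ),
}


def extcon_after_core_init(calls):
    """Return True if dwc3_get_extcon comes after dwc3_core_init."""
    return calls.index("dwc3_get_extcon") > calls.index("dwc3_core_init")


for name, (calls, host_works) in ORDERINGS.items():
    print(name, extcon_after_core_init(calls), host_works)
```

In all three reported cases, host mode works exactly when dwc3_get_extcon comes after dwc3_core_init, which matches the reports but is not by itself an explanation of the cause.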
Right. Thanks for narrowing it down. There are still many steps in
dwc3_core_init(). We have some suspicion, but we still haven't confirmed
the exact cause of the failure. We can write a proper patch once we know
the reason.
If you would like me to test your suspicion, just tell me what to do :-)
OK, Ferry, I think I'm going to need clarification on specifics on
your test setup. Can you share your kernel config, maybe your
"/proc/config.gz", somewhere? When you say you are running vanilla
Linux, do you mean it or do you mean vanilla tree + some patch delta?
For v6.0 I can get the exact details tonight. But earlier I had this for v5.17:
https://github.com/htot/meta-intel-edison/blob/master/meta-intel-edison-bsp/recipes-kernel/linux/linux-yocto_5.17.bb
There are 2 patches referred to in #67 and #68. One is related to the
infinite loop. The other is, I believe, also needed to get dwc3 to work.
All the kernel config options are applied as .cfg fragments.
Patches and .cfg files are here:
https://github.com/htot/meta-intel-edison/tree/master/meta-intel-edison-bsp/recipes-kernel/linux/files
Updated Yocto recipe for v6.0 here:
https://github.com/htot/meta-intel-edison/blob/honister/meta-intel-edison-bsp/recipes-kernel/linux/linux-yocto_6.0.bb
#75-#77 are the 2 reverts from Andy, plus one SOF revert (not related to
this thread).
Otherwise, via the git route, https://github.com/andy-shev/linux should
lead to the same result, although you might want to drop "WIP: serial:
8250_dma: use sgl on transmit".
The reason I'm asking is because I'm having a hard time reproducing
the problem on my end. In fact, when I build v6.0
(4fe89d07dcc2804c8b562f6c7896a45643d34b2f) and then do a
git revert 8bd6b8c4b100 0f0101719138 (original revert proposed by Andy)
I get an infinite loop of reprobing that looks something like (some
debug tracing, function name + line number, included):
[ 6.160732] tusb1210 dwc3.0.auto.ulpi: error -110 writing val 0x41 to reg 0x80
[ 6.172299] XXXXXXXXXXX: dwc3_probe 1834
[ 6.172426] XXXXXXXXXXX: dwc3_core_init_mode 1386
[ 6.176391] XXXXXXXXXXX: dwc3_drd_init 593
[ 6.181573] dwc3 dwc3.0.auto: Driver dwc3 requests probe deferral
[ 6.191886] platform dwc3.0.auto: Added to deferred list
[ 6.197249] platform dwc3.0.auto: Retrying from deferred list
[ 6.203057] bus: 'platform': __driver_probe_device: matched device dwc3.0.auto with driver dwc3
[ 6.211783] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
[ 6.219935] XXXXXXXXXXX: dwc3_probe 1822
[ 6.219952] XXXXXXXXXXX: dwc3_core_init 1092
[ 6.223903] XXXXXXXXXXX: dwc3_core_init 1095
[ 6.234839] bus: 'ulpi': __driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
[ 6.248335] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
[ 6.257039] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
[ 6.264501] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
[ 6.272553] debugfs: Directory 'dwc3.0.auto' with parent 'ulpi' already present!
[ 6.279978] XXXXXXXXXXX: dwc3_core_init 1099
[ 6.279991] XXXXXXXXXXX: dwc3_core_init 1103
[ 6.345769] tusb1210 dwc3.0.auto.ulpi: error -110 writing val 0x41 to reg 0x80
[ 6.357316] XXXXXXXXXXX: dwc3_probe 1834
[ 6.357447] XXXXXXXXXXX: dwc3_core_init_mode 1386
[ 6.361402] XXXXXXXXXXX: dwc3_drd_init 593
[ 6.366589] dwc3 dwc3.0.auto: Driver dwc3 requests probe deferral
[ 6.376901] platform dwc3.0.auto: Added to deferred list
which renders the system completely unusable, so USB host is
definitely going to be broken too. Now, ironically, with my patch
in place, the attempt to probe extcon, which ends up deferring the probe,
happens before the ULPI driver failure (which wasn't failing driver
probe prior to
https://lore.kernel.org/all/20220213130524.18748-7-hdegoede@xxxxxxxxxx/),
so there is no "driver binding" event that re-triggers deferred probe
and causes the loop. The system therefore progresses to a point where
extcon is available and the dwc3 driver eventually loads.
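To illustrate the loop described above, here is a toy model in Python. It is a sketch under a stated assumption, not kernel code: a deferred probe is retried immediately only if some driver bound during the failed attempt. The function and parameter names are invented for illustration, and extcon is assumed to stay unavailable for the whole run.

```python
def count_probe_attempts(extcon_lookup_first, max_attempts=10):
    """Count dwc3 probe attempts in a toy model where extcon never appears.

    Assumption: a deferred probe is retried right away only if a driver
    bound during the attempt that ended in the deferral.
    """
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        # Every attempt defers because extcon is missing. What differs is
        # whether the ULPI PHY driver bound before the deferral happened.
        phy_bound_before_deferral = not extcon_lookup_first
        if not phy_bound_before_deferral:
            # No bind event: the device stays parked on the deferred list.
            break
    return attempts


# PHY binds first, extcon lookup defers afterwards: retries until the cap.
print(count_probe_attempts(extcon_lookup_first=False))
# extcon lookup defers first: one attempt, then the device waits.
print(count_probe_attempts(extcon_lookup_first=True))
```

In this model the first call keeps retrying until it hits max_attempts, which stands in for the endless reprobing in the log, while the second stops after a single attempt and waits for an unrelated bind event.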
After that, and I don't know if I'm doing the same test, USB host
seems to work as expected. lsusb works, my USB stick enumerates as
expected. Switching the USB mux to micro-USB and back shuts the host
functionality down and brings it up as expected. Now I didn't try to
load any gadgets to make sure USB gadget works 100%, but since you
were saying it was USB host that was broken, I wasn't concerned with
that. Am I doing the right test?
For reference, what I test with is:
- vanilla kernel, no patch delta (sans minor debug tracing) + initrd
  built with Buildroot 2022.08.1
- Initrd is using systemd (don't think that really matters, but who
  knows)
- U-Boot 2022.04 (built with Buildroot as well)
- kernel config is x86_64_defconfig + whatever I gathered from *.cfg
  files in
  https://github.com/edison-fw/meta-intel-edison/tree/master/meta-intel-edison-bsp/recipes-kernel/linux/files