Hi
On 11-10-2022 11:21, Andy Shevchenko wrote:
On Mon, Oct 10, 2022 at 02:40:30PM -0700, Andrey Smirnov wrote:
On Mon, Oct 10, 2022 at 12:13 AM Andy Shevchenko
<andriy.shevchenko@xxxxxxxxxxxxxxx> wrote:
On Sun, Oct 09, 2022 at 10:02:26PM -0700, Andrey Smirnov wrote:
On Fri, Oct 7, 2022 at 6:07 AM Ferry Toth<fntoth@xxxxxxxxx> wrote:
...
OK, Ferry, I think I'm going to need clarification on specifics on
your test setup. Can you share your kernel config, maybe your
"/proc/config.gz", somewhere? When you say you are running vanilla
Linux, do you mean it or do you mean vanilla tree + some patch delta?
The reason I'm asking is because I'm having a hard time reproducing
the problem on my end. In fact, when I build v6.0
(4fe89d07dcc2804c8b562f6c7896a45643d34b2f) and then do a
git revert 8bd6b8c4b100 0f0101719138 (original revert proposed by Andy)
I get an infinite loop of reprobing that looks something like (some
debug tracing, function name + line number, included):
Yes, this is (one of) known drawback(s) of deferred probe hack. I think
the kernel that Ferry runs has a patch that basically reverts one from
2014 [1] and allows to have extcon as a module. (1)
[1]: 58b116bce136 ("drivercore: deferral race condition fix")
which renders the system completely unusable, but USB host is
definitely going to be broken too. Now, ironically, with my patch
in-place, an attempt to probe extcon that ends up deferring the probe
happens before the ULPI driver failure (which wasn't failing driver
probe prior tohttps://lore.kernel.org/all/20220213130524.18748-7-hdegoede@xxxxxxxxxx/),
there no "driver binding" event that re-triggers deferred probe
causing the loop, so the system progresses to a point where extcon is
available and dwc3 driver eventually loads.
After that, and I don't know if I'm doing the same test, USB host
seems to work as expected. lsusb works, my USB stick enumerates as
expected. Switching the USB mux to micro-USB and back shuts the host
functionality down and brings it up as expected. Now I didn't try to
load any gadgets to make sure USB gadget works 100%, but since you
were saying it was USB host that was broken, I wasn't concerned with
that. Am I doing the right test?
Hmm... What you described above sounds more like a yet another attempt to
workaround (1). _If_ this is the case, we probably can discuss how to fix
it in generic way (somewhere in dd.c, rather than in the certain driver).
No, I'm not describing an attempt to fix anything. Just how vanilla
v6.0 (where my patch is not reverted) works and where my patch, fixing
a logical problem in which extcon was requested too late causing a
forced OTG -> "gadget only" switch, also changed the ordering enough
to accidentally avoid the loop.
You still refer to a fix, but my question was if it's the same problem or not.
That said, the real test case should be performed on top of clean kernel
before judging if it's good or bad.
Given your level of involvemnt with this particular platform and you
being the author of
https://github.com/edison-fw/meta-intel-edison/blob/master/meta-intel-edison-bsp/recipes-kernel/linux/files/0043b-TODO-driver-core-Break-infinite-loop-when-deferred-p.patch
I assumed/expected you to double check this before sending this revert
out. Please do so next time.
As I said I have not yet restored my testing environment for that platform and
I relied on the Ferry's report. Taking into account the history of breakages
that done for Intel Merrifield, in particular by not wide tested patches
against DWC3 driver, I immediately react with a revert. I agree that I had had
to think about that ordering issue and a patch on top of it first. I thought
that Yocto configuration that Ferry is using is clean of custom (controversial)
patches like that. Now, since you have this platform, you can also help with
testing the DWC3 on it.
It's my fault I'm afraid. My apologies. We have been needing the "Break
infinite loop" patch at least since v5.10. Without IIRC we can not even
boot normally or at least we have no dwc3. No way to bisect. In my mind
I didn't link that patch to this issue. Other patches were indeed
removed during bisecting.
But it is interesting that we should be able to drop that patch for
v6.0, even if that is a side effect. I will try to reproduce and report
back here.