On 9/17/21 10:21 AM, Saravana Kannan wrote: > On Fri, Sep 17, 2021 at 9:59 AM Florian Fainelli <f.fainelli@xxxxxxxxx> wrote: >> >> On 9/16/21 7:27 PM, Saravana Kannan wrote: >>> On Thu, Sep 16, 2021 at 7:21 PM Florian Fainelli <f.fainelli@xxxxxxxxx> wrote: >>>> >>>> >>>> >>>> On 9/15/2021 1:19 AM, Saravana Kannan wrote: >>>>> This reverts commit cf4b94c8530d14017fbddae26aad064ddc42edd4. >>>>> >>>>> Some PHYs pointed to by "phy-handle" will never bind to a driver until a >>>>> consumer attaches to it. And when the consumer attaches to it, they get >>>>> forcefully bound to a generic PHY driver. In such cases, parsing the >>>>> phy-handle property and creating a device link will prevent the consumer >>>>> from ever probing. We don't want that. So revert support for >>>>> "phy-handle" property until we come up with a better mechanism for >>>>> binding PHYs to generic drivers before a consumer tries to attach to it. >>>>> >>>>> Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx> >>>> >>>> Thanks for getting this revert submitted, I just ran a bisection this >>>> afternoon that pointed to this offending commit. It would cause the dead >>>> lock >>> >>> Dead lock in the kernel? Or do you mean just a hang waiting forever for network? >> >> It locks up since we try to acquire __device_driver_lock() while we are >> in the main driver's probe function. > > Off the top of my head that sounds weird, but I'll look into the > logs/code later. Bunch of other stuff and some LPC prep comes first :) > >> >>> >>>> on boot with drivers/net/dsa/bcm_sf2.c pasted below. >>> >>> The log is too jumbled up to be readable (word wrap I suppose) and >>> maybe even multiple thread printing at the same time. >> >> Re-attached (thunderbird is not really helping me). > > Thanks! > >> >>> >>>> Saravana, can >>>> you CC on me on your fix or what you would want me to be testing? >>> >>> By fix, I assume you mean when I bring back phy-handle with a proper >>> fix to handle the case in the commit text? Yeah, that's going to take >>> a while. It's brewing in my head and there are some issues that's not >>> fully resolved. But I haven't had time to code it up or dig into the >>> net code to make sure it'll work. But yes, I'll CC you when I do so >>> you can test it with this case. >> >> Well by fix, I meant something that does not lock up on my system, > > Hold on. Now I'm confused. Are you still hitting hangs/issues after > the phy-handle patch is reverted? With the 'phy-handle' patch reverted no, I do not see the hang. > >> it is >> a different problem from supporting 'phy-handle', but it should not >> regress an existing system, no matter how quirky that system behaves in >> its probe function. For history and reference, the "offending" change >> and background can be found here: >> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=771089c2a485958e423f305e974303760167b45c > > So that is the change that's interacting with the phy-handle patch to > cause the deadlock? > > I'm a bit confused on what needs debugging now. Sorry was not very clear, what I meant to write is that your next attempt at solving 'phy-handle' should not be regressing on my system, and I will make sure you get appropriate testing of that patch. -- Florian