On Fri, Feb 12, 2021 at 12:15 AM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
>
> Hi Saravana,
>
> On Fri, Feb 12, 2021 at 4:00 AM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> > On Thu, Feb 11, 2021 at 5:00 AM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > >   1. R-Car Gen2 (Koelsch), R-Car Gen3 (Salvator-X(S), Ebisu).
> > >
> > >       - Commit 2dfc564bda4a31bc ("soc: renesas: rcar-sysc: Mark device
> > >         node OF_POPULATED after init") is no longer needed (but already
> > >         queued for v5.12 anyway)
> >
> > Rob doesn't like the proliferation of OF_POPULATED and we don't need
> > it anymore, so maybe work it out with him? It's a balance between some
> > wasted memory (struct device(s)) vs not proliferating OF_POPULATED.
>
> Rob: should it be reverted? For v5.13?
> I guess other similar "fixes" went in in the mean time.
>
> > >       - Some devices are reprobed, despite their drivers returning
> > >         a real error code, and not -EPROBE_DEFER:
> >
> > Sorry, it's not obvious from the logs below where "reprobing" is
> > happening. Can you give more pointers please?
>
> My log was indeed not a full log, but just the reprobes happening.
> I'll send you a full log by private email.
>
> > Also, thinking more about this, the only way I could see this happen is:
> > 1. Device fails with error that's not -EPROBE_DEFER
> > 2. It somehow gets added to a device link (with AUTOPROBE_CONSUMER
> > flag) where it's a consumer.
> > 3. The supplier probes and the device gets added to the deferred probe
> > list again.
> >
> > But I can't see how this sequence can happen. Device links are created
> > only when a device is added. And is the supplier isn't added yet, the
> > consumer wouldn't have probed in the first place.
>
> The full log doesn't show any evidence of the device being added
> to a list in between the two probes.
>
> > Other than "annoying waste of time" is this causing any other problems?
>
> Probably not. But see below.
>
> > >       - The PCI reprobing leads to a memory leak, for which I've sent a fix
> > >         "[PATCH] PCI: Fix memory leak in pci_register_io_range()"
> > >         https://lore.kernel.org/linux-pci/20210202100332.829047-1-geert+renesas@xxxxxxxxx/
> >
> > Wrt PCI reprobing,
> > 1. Is this PCI never expected to probe, but it's being reattempted
> > despite the NOT EPROBE_DEFER error? Or
>
> There is no PCIe card present, so the failure is expected.
> Later it is reprobed, which of course fails again.
>
> > 2. The PCI was deferred probe when it should have probed and then when
> > it's finally reattemped and it could succeed, we are hitting this mem
> > leak issue?
>
> I think the leak has always been there, but it was just exposed by
> this unneeded reprobe. I don't think a reprobe after that specific
> error path had ever happened before.
>
> > I'm basically trying to distinguish between "this stuff should never
> > be retried" vs "this/it's suppliers got probe deferred with
> > fw_devlink=on vs but didn't get probe deferred with
> > fw_devlink=permissive and that's causing issues"
>
> There should not be a probe deferral, as no -EPROBE_DEFER was
> returned.
>
> > >       - I2C on R-Car Gen3 does not seem to use DMA, according to
> > >         /sys/kernel/debug/dmaengine/summary:
> > >
> > >             -dma4chan0    | e66d8000.i2c:tx
> > >             -dma4chan1    | e66d8000.i2c:rx
> > >             -dma5chan0    | e6510000.i2c:tx
> >
> > I think I need more context on the problem before I can try to fix it.
> > I'm also very unfamiliar with that file. With fw_devlink=permissive,
> > I2C was using DMA? If so, the next step is to see if the I2C relative
> > probe order with DMA is getting changed and if so, why.
>
> Yes, I plan to dig deeper to see what really happens...

Try fw_devlink.strict (you'll need IOMMU enabled too). If that fixes it
and you also don't see this issue with fw_devlink=permissive, then it
means there's probably some unnecessary probe deferral that we should
try to avoid. At least, that's my hunch right now.
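
One rough way to compare the boots: save /sys/kernel/debug/dmaengine/summary from each boot to a file and list the channels that had a client in one boot but not in the other. This is only a sketch. The helper names are made up for illustration, and the line layout ("dmaNchanM | client") is assumed from the excerpt quoted above rather than taken from the dmaengine code.

```python
import re

# Matches channel lines such as "dma4chan0 | e66d8000.i2c:tx".
# Assumption: the layout is inferred from the excerpt quoted in this thread.
# Leading whitespace or a diff marker ("-", "+") in front of the name is skipped.
CHANNEL_RE = re.compile(r"^\W*(dma\d+chan\d+)\s*\|\s*(\S+)")


def used_channels(summary_text):
    """Return {channel: client} for every line that names a channel client."""
    channels = {}
    for line in summary_text.splitlines():
        match = CHANNEL_RE.match(line)
        if match:
            channels[match.group(1)] = match.group(2)
    return channels


def lost_channels(reference_text, test_text):
    """Channels that have a client in the reference boot but not in the test boot."""
    reference = used_channels(reference_text)
    test = used_channels(test_text)
    return {chan: client for chan, client in reference.items() if chan not in test}
```

Feeding it the summary saved with fw_devlink=permissive as the reference and the one saved with fw_devlink=on as the test should list the I2C channels that went missing, if the file really looks like the excerpt.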

Thanks,
Saravana

>
> > >   - On R-Mobile A1, I get a BUG and a memory leak:
> > >
> > >         BUG: spinlock bad magic on CPU#0, swapper/1
> > >
> > Hmm... I looked at this in bits and pieces throughout the day. At
> > least spent an hour looking at this. This doesn't make a lot of sense
> > to me. I don't even touch anything in this code path AFAICT. Are
> > modules/kernel mixed up somehow? I need more info before I can help.
> > Does reverting my pm domain change make any difference (assume it
> > boots this far without it).
>
> I plan to dig deeper to see what really happens...
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds