Hi Grant, > On Apr 2, 2015, at 05:38 , Grant Likely <grant.likely@xxxxxxxxxxxx> wrote: > > On Mon, 30 Mar 2015 15:27:27 +0200 > , Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> > wrote: >> Hi Pantelis, >> >> On Tue, Mar 24, 2015 at 6:56 PM, Pantelis Antoniou >> <pantelis.antoniou@xxxxxxxxxxxx> wrote: >>>> On Mar 24, 2015, at 07:50 , Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote: >>>> On Fri, Mar 20, 2015 at 12:39 PM, Pantelis Antoniou >>>> <pantelis.antoniou@xxxxxxxxxxxx> wrote: >>>>>> On Mar 19, 2015, at 21:18 , Grant Likely <grant.likely@xxxxxxxxxxxx> wrote: >>>>>> On Tue, 16 Dec 2014 14:11:31 +0200 >>>>>> , Pantelis Antoniou <pantelis.antoniou@xxxxxxxxxxxx> >>>>>> wrote: >>>>>>> A nice side-effect of the changes in DTC for supporting overlays >>>>>>> is that it is now possible to do dependency tracking of platform >>>>>>> devices automatically. >>>>>>> >>>>>>> This patch implements dependency aware probe order for users >>>>>>> of of_platform_populate. >>>>>>> >>>>>>> There are no changes in the syntax of the DTS bindings, the >>>>>>> dependency is generated automatically by the use of phandle >>>>>>> references. >>>>>> >>>>>> Do you have measurements showing improvement? Conceptually, I don't have >>>>>> a problem with having a small scale solution like this, but I want proof >>>>>> that it actively makes things better, and is worth the extra complexity. >>>>>> It's not an easy block of code to understand. >>>>> >>>>> I will be the first to admit that the code it’s a bit hard to follow, but >>>>> that’s the nature of trees and recursion. >>>>> >>>>> FWIW I’ve been booting with this applied for a month with no adverse effects, >>>>> besides the fact that there dependency cycles which I would file as a bug. >>>> >>>> IIUC, this would fix the issue I worked around in "ARM: shmobile: r8a73a4: >>>> Move pfc node to work around probe ordering bug"? >>>> https://git.kernel.org/cgit/linux/kernel/git/horms/renesas.git/commit/?h=r8a73a4-ccf-and-multiplatform-for-v4.1&id=e4ba0a9bddff3ba52cec100414d2f178440efc91 >>> >>> A victim^Wtester! Yeah, it’s supposed to do that, but no guarantees, this is an RFC. >>> >>> I would be happy to know if it solves your problem or not, please keep me in the loop. >>> >>>> I'll give it a try when I'm back from ELC... >> >> And so I did. I >> 1. Moved the pfc node in arch/arm/boot/dts/r8a73a4.dtsi back between dmac >> and i2c5. >> 2. Applied your patch, >> 3. Disabled the check for "/__local_fixups__" in >> __of_platform_populate_scan(). >> 4. Added support for "interrupts-extended" (gmail-whitespace-damaged patch >> at the end of this email), >> 5. Enabled DEBUG, >> 6. Uncommented all your debug pr_info() calls (except the one for >> "interrupts", as it references non-existing variables!). >> >> But it didn't make any difference. The debug log below shows that it detects >> the dependency of pfc@e6050000 on irqc0 (interrupt-controller@e61c0000) >> irqc1 (interrupt-controller@e61c0200), but it doesn't reorder the nodes, >> so pfc is still initialized first: > > I talked to Pantelis about this while we were at ELC. Even if it did > correctly reorder the registration and solved your problem, it won't > help at all for most devices registered by the of_platform_populate() > call. > > The core device model is oportunistic about when drivers get bound to > devices. Both drivers and devices can be registered at any time. When a > device is registered, it kernel tries to bind it against all available > drivers, and similarly when a driver is registered the kernel tries to > bind it with all unbound devices. > > In the normal case, of_platform_populate() is called at arch_initcall > (level 3). Most drivers are registered at device_initcall (level 6). > The only reason reordering works for you is because both the renesas pfc > irqc drivers use postcore_initcall (level 2). Therefore, both drivers > get registered before of_platform_populate() which makes device > registration order matter. If the devices were registered before the > drivers, then it would be kernel link order which determines the > behaviour. > > There are two ways to fix this so that .dtb order doesn't matter. The > dirty hack is to change the pfc driver to use subsys_initcall (level 4) > or later so that it happens after the devices are registered. The second > solution is to make the pfc drivers able to return -EPROBE_DEFER, but > that also requires fixing deferred probe to start retrying devices > before late_initcall time. Right now there is a holdoff flag that > prevents any retries before late_initcall(). It was done that way > because the defer workqueue conflicted with some of the initcalls in a > bad way. > > We could add a manual walk of the queue between each initcall level > without using the workqueue. That would solve the problem. > I’ve been spending some time getting things to work properly taking into account the essentially random way that drivers are registered. My complete understanding is this: The platform devices that are part of any directly reachable from root simple busses are registered sometime on init_machine() at arch_initcall() time. The drivers are registered at device_initcall() but not all of them. Some of them are _required_ to be initialized earlier in order for the devices to be instantiated first. For instance dma engine drivers. I am toying with marking drivers that have been matched with devices and not instantiating them immediately, and that might work. Perhaps an easier solution would be just to delay device register to be performed at the the end device_initcall() since all the drivers would be registered and the device matched immediately. I don’t know if that will make things go kaboom on some $RANDOM_ARCH though. Does anything come to mind? IMHO getting this right might make the holdoff hacks go away completely. > g. Regards — Pantelis -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html