Hi Saravana, On 5/24/19 9:04 PM, Saravana Kannan wrote: > On Fri, May 24, 2019 at 7:40 PM Frank Rowand <frowand.list@xxxxxxxxx> wrote: >> >> Hi Saranova, >> >> I'll try to address the other portions of this email that I <snipped> >> in my previous replies. >> >> >> On 5/24/19 2:53 PM, Saravana Kannan wrote: >>> On Fri, May 24, 2019 at 10:49 AM Frank Rowand <frowand.list@xxxxxxxxx> wrote: >>>> >>>> On 5/23/19 6:01 PM, Saravana Kannan wrote: >>>>> Add a generic "depends-on" property that allows specifying mandatory >>>>> functional dependencies between devices. Add device-links after the >>>>> devices are created (but before they are probed) by looking at this >>>>> "depends-on" property. >>>>> >>>>> This property is used instead of existing DT properties that specify >>>>> phandles of other devices (Eg: clocks, pinctrl, regulators, etc). This >>>>> is because not all resources referred to by existing DT properties are >>>>> mandatory functional dependencies. Some devices/drivers might be able>>>>> to operate with reduced functionality when some of the resources In your original email, you say this: >>>>> aren't available. For example, a device could operate in polling mode >>>>> if no IRQ is available, a device could skip doing power management if >>>>> clock or voltage control isn't available and they are left on, etc. >>>>> >>>>> So, adding mandatory functional dependency links between devices by >>>>> looking at referred phandles in DT properties won't work as it would >>>>> prevent probing devices that could be probed. By having an explicit >>>>> depends-on property, we can handle these cases correctly. >>>> >>>> Trying to wrap my brain around the concept, this series seems to be >>>> adding the ability to declare that an apparent dependency (eg an IRQ >>>> specified by a phandle) is _not_ actually a dependency. >>> >>> The current implementation completely ignores existing bindings for >>> dependencies and so does the current tip of the kernel. So it's not >>> really overriding anything. However, if I change the implementation so >>> that depends-on becomes the source of truth if it exists and falls >>> back to existing common bindings if "depends-on" isn't present -- then >>> depends-on would truly be overriding existing bindings for >>> dependencies. It depends on how we want to define the DT property. >>> >>>> The phandle already implies the dependency. >>> >>> Sure, it might imply, but it's not always true. >>> >>>> Creating a separate >>>> depends-on property provides a method of ignoring the implied >>>> dependencies. >>> >>> implied != true >>> I refer to your irq mode vs polled mode device example: >>>> This is not just hardware description. It is instead a combination >>>> of hardware functionality and driver functionality. An example >>>> provided in the second paragraph of the email I am replying to >>>> suggests a device could operate in polling mode if no IRQ is >>>> available. Using this example, the devicetree does not know >>>> whether the driver requires the IRQ (currently an implied >>>> dependency since the IRQ phandle exists). My understanding >>>> of this example is that the device node would _not_ have a >>>> depends-on property for the IRQ phandle so the IRQ would be >>>> optional. But this is an attribute of the driver, not the >>>> hardware. >>> >> You change the subject from irq mode vs polled mode device to some other type of device: >>> Not really. The interrupt could be for "SD card plugged in". That's >>> never a mandatory dependency for the SD card controller to work. So >>> the IRQ provider won't be a "depends-on" in this case. But if there is >>> no power supply or clock for the SD card controller, it isn't going to >>> work -- so they'd be listed in the "depends-on". So, this is still >>> defining the hardware and not the OS. >> I again try to get you to discuss the irq mode vs polled mode device: >> Please comment on my observation that was based on an IRQ for a device >> will polling mode vs interrupt driven mode. >> You described a different >> case and did not address my comment. > > I thought I did reply -- not sure what part you are looking for so > I'll rephrase. I was just picking the SD card controller as a concrete > example of device that can work with or without an interrupt. But > sure, I can call it "the device". > And the thread is so deeply nested that you are missing the original point that I made. > And yes, the device won't have a "depends-on" on the IRQ provider > because the device can still work without a working (as in bound to > driver) IRQ provider. Whether the driver insists on waiting on an IRQ > provider or not is up to the driver and the depends-on property is NOT > trying to dictate what the driver should do in this case. Does that > answer your implied question? If the device _must_ operate in irq mode to achieve the throughput that is _required_ for the system to be functional then that system would need a devicetree to have a "depends-on" property for the irq. But another system using the same exact hardware might be able to tolerate the reduced throughput of operating in polled mode. This second system would need a devicetree that does _not_ have a depends-on property for that same irq, as used by that same device. This then becomes configuration, not hardware description, as noted in my next paragraph: > >> >>>> This is also configuration, declaring whether the >>>> system is willing to accept polling mode instead of interrupt >>>> mode. -Frank >>> >>> Whether the driver will choose to operate without the IRQ is up to it. >>> The OS could also assume the power supply is never turned off and >>> still try to use the device. Depending on the hardware configuration, >>> that might or might not work. >>> >>>> Devicetree is not the proper place for driver description or >>>> for configuration. >>> >>> But depends-on isn't describing the driver configuration though. >>> >>> Overall, the clock provider example I gave in another reply is a much >>> better example. If you just assume implied dependencies are mandatory >>> dependencies, some devices will never be probe because the kernel is >>> using them incorrectly (they aren't meant to list mandatory >>> dependencies). >>> >>>> Another flaw with this method is that existing device trees >>>> will be broken after the kernel is modified, because existing >>>> device trees do not have the depends-on property. This breaks >>>> the devicetree compatibility rules. >>> >>> This is 100% not true with the current implementation. I actually >>> tested this. This is fully backwards compatible. That's another reason >>> for adding depends-on and going by just what it says. The existing >>> bindings were never meant to describe only mandatory dependencies. So >>> using them as such is what would break backwards compatibility. >>> >>>>> Having functional dependencies explicitly called out in DT and >>>>> automatically added before the devices are probed, provides the >>>>> following benefits: >>>>> >>>>> - Optimizes device probe order and avoids the useless work of >>>>> attempting probes of devices that will not probe successfully >>>>> (because their suppliers aren't present or haven't probed yet). >>>>> >>>>> For example, in a commonly available mobile SoC, registering just >>>>> one consumer device's driver at an initcall level earlier than the >>>>> supplier device's driver causes 11 failed probe attempts before the >>>>> consumer device probes successfully. This was with a kernel with all >>>>> the drivers statically compiled in. This problem gets a lot worse if >>>>> all the drivers are loaded as modules without direct symbol >>>>> dependencies. >>>>> >>>>> - Supplier devices like clock providers, regulators providers, etc >>>>> need to keep the resources they provide active and at a particular >>>>> state(s) during boot up even if their current set of consumers don't >>>>> request the resource to be active. This is because the rest of the >>>>> consumers might not have probed yet and turning off the resource >>>>> before all the consumers have probed could lead to a hang or >>>>> undesired user experience. >>>>> >>>>> Some frameworks (Eg: regulator) handle this today by turning off >>>>> "unused" resources at late_initcall_sync and hoping all the devices >>>>> have probed by then. This is not a valid assumption for systems with >>>>> loadable modules. Other frameworks (Eg: clock) just don't handle >>>>> this due to the lack of a clear signal for when they can turn off >>>>> resources. This leads to downstream hacks to handle cases like this >>>>> that can easily be solved in the upstream kernel. >>>>> >>>>> By linking devices before they are probed, we give suppliers a clear >>>> >>>> By linking devices to suppliers before they are probed, we give suppliers a clear >>> >>> Ack >>> >>>>> count of the number of dependent consumers. Once all of the >>>>> consumers are active, the suppliers can turn off the unused >>>>> resources without making assumptions about the number of consumers. >>>>> >>>>> By default we just add device-links to track "driver presence" (probe >>>>> succeeded) of the supplier device. If any other functionality provided >>>>> by device-links are needed, it is left to the consumer/supplier >>>>> devices to change the link when they probe. >>>>> >>>>> >>>>> Saravana Kannan (5): >>>>> of/platform: Speed up of_find_device_by_node() >>>>> driver core: Add device links support for pending links to suppliers >>>>> dt-bindings: Add depends-on property >>>>> of/platform: Add functional dependency link from "depends-on" property >>>>> driver core: Add sync_state driver/bus callback >>>>> >>>>> .../devicetree/bindings/depends-on.txt | 26 +++++ >>>>> drivers/base/core.c | 106 ++++++++++++++++++ >>>>> drivers/of/platform.c | 75 ++++++++++++- >>>>> include/linux/device.h | 24 ++++ >>>>> include/linux/of.h | 3 + >>>>> 5 files changed, 233 insertions(+), 1 deletion(-) >>>>> create mode 100644 Documentation/devicetree/bindings/depends-on.txt >>>>> >>>> >>>> The above issues make this specific implementation not acceptable. >>>> I like the analysis of the problem areas, and I like the concepts of >>>> trying to solve not only probe ordering, but also the problem of >>>> when to turn off resources that will not be needed. >>> >>> Beating a dead horse here, but I want to make sure I get this into as >>> many minds as possible: >>> It is NOT just about turning off resources. It's about the kernel >>> taking full control of resources (allowing the full range of voltages, >>> clock frequencies, bus configurations, etc) and syncing the HW state >>> to the SW state as determined by the consumers. >>> >>>> But at the >>>> moment, I don't have a suggestion of a way to implement a solution. >>> >>> The problem of syncing resources to SW state after boot up completed >>> has been broken for a long time in the kernel. It's high time we fix >>> it. I'm open to other suggestions, but we can't just say "we don't >>> have a solution". >>> >>> For example, we can have a kernel command line argument that'll use >>> all common implicit bindings as mandatory dependencies and allow >>> "depends-on" to override them for the few cases where the implicit >>> dependencies don't match mandatory dependencies. Then: >>> - The kernel will be 100% backwards compatible with existing DT if the >>> command line arg isn't provided. >>> - New DT + old kernel will be no worse than today because old kernel >>> doesn't do any dependency tracking. >>> - Command line arg + new kernel + hardware where all implicit >>> dependencies are actually mandatory dependencies == things work >>> better. >>> - Command line arg + new kernel + hardware where all implicit >>> dependencies don't match mandatory dependencies + depends-on for those >>> exception case == things work better. >> >> Using a command line argument for this purpose just seems to be a >> hack and bad architecture. > > I agree -- which is why the current implementation doesn't try to form > mandatory dependency from implicit bindings and makes the case that > all mandatory dependencies should be called out explicitly. But if > people insist on that implicit bindings be used as such, then a CONFIG > or commandline option would be an acceptable compromise for me. > >> It also seems like an invitation to mis-configure a system (in other >> words, increases the complexity and difficulty of properly configuring >> and administering a system). > > Just want to point out that, as of today, we have a broken system -- > module loading is not compatible with proper handling of shared > mandatory resources during boot up. To be clear, I'm not arguing for a > commandline arg. > >> The is not a hard no (yet), but it will take some convincing for me >> to accept the command line approach to add the feature, yet maintain >> compatibility. Please do not spend any time replying to this concern >> yet - we will have plenty of time to discuss later if need be. > > Sounds good. > > -Saravana >