Hello Grant and Rob,

I am writing this email to you specifically to ask for your technical assessment of representing device thermal limits as device tree nodes. I am proposing to introduce device tree nodes that describe these limits as thermal zones, their composition, and their relations with cooling devices and other thermal zones (thermal data).

As you know, device thermal limits are part of the hardware specification. Board layout, mechanics, power dissipation, composition of ICs, etc. impose thermal requirements on a system, and infringing these limits can lead to device damage, reduced device lifetime or even end user harm. Thus, the thermal data helps to describe the hardware limits and what needs to be done if those limits are crossed, as part of the board design and non-functional requirements. Obviously that is very dependent on the hardware, and not every board will have these non-functional requirements. Besides, describing these limits has *nothing* to do with how you actually find them. In any case, there is a need to properly represent these requirements, and I am proposing to have this representation in device tree.

There have already been a couple of counter-arguments claiming this is actually about configuration and performance profile description. I still disagree with both readings of this proposal and state again that interpreting it as configuration or a performance profile is a misunderstanding [0]. Let me state it clearly (again [1]): my proposal is to describe hardware thermal limits, because these limits are part of a hardware specification; representing them in device tree would not infringe the original purpose of this data structure ("The Device Tree is a data structure for describing hardware." [2]).

Before I explain my proposal, I also want to highlight that this data is already represented elsewhere and is reused across different OSes.
Thermal data is described using ACPI [3], and ACPI-aware operating systems do support the interpretation of thermal data. Linux is one example of such systems (I believe I do not need to list here all systems supporting ACPI). On the other hand, not all systems have ACPI or are specified to use ACPI. This is another reason to represent thermal data properly, so that we can scale across systems.

In the specific case of Linux, the thermal concepts common to ACPI and non-ACPI systems have been represented in the thermal framework (CONFIG_THERMAL). Today, on ACPI systems, thermal data is fetched from the bootloader with help from the common ACPI parser. On non-ACPI systems, the thermal data is actually coded as part of device drivers.

So, to the point, a brief explanation of my proposal goes as follows:

i - trip points: a node to describe a point in the temperature domain at which the system has to take an action. This node describes just the point, not the action. Properties here are temperature, hysteresis, and type (critical, hot, passive, active, etc).

ii - binding parameters: the bind_param node describes how actions (cooling devices) get assigned to trip points. Cooling devices are expected to be loaded in the target system. Properties here are: cooling device name, weight, trip_mask and limits.

iii - thermal zones: the thermal_zone node contains all the info required to describe a thermal zone with hardware thermal limitation, including its bindings with cooling devices. Properties here are: type, passive_delay, polling_delay, governor. Apart from its own properties, the thermal_zone node must contain one node holding the trip nodes and one node holding all the zone bind parameters.
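To make the relation between the three node types above more concrete, here is a minimal sketch in plain C. It is an illustration only and not the code in the series: the struct names, the helper names, the hysteresis rule (a crossed trip stays crossed until the temperature falls below temperature minus hysteresis) and the reading of trip_mask as "bit n selects trip n" are all assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Trip types named in the proposal; this enum itself is hypothetical. */
enum trip_type { TRIP_ACTIVE, TRIP_PASSIVE, TRIP_HOT, TRIP_CRITICAL };

/* i - trip point: only the point in the temperature domain, not the action. */
struct trip_point {
    int temperature;        /* milliCelsius */
    int hysteresis;         /* milliCelsius */
    enum trip_type type;
};

/* ii - binding parameters: how a cooling device is assigned to trip points. */
struct bind_param {
    const char *cooling_device;
    int weight;             /* percentage */
    unsigned int trip_mask; /* assumption: bit n set means "bound to trip n" */
};

/* iii - thermal zone: its own properties plus its trips and bindings. */
struct thermal_zone {
    const char *type;
    int passive_delay;      /* milliseconds */
    int polling_delay;      /* milliseconds */
    const char *governor;
    const struct trip_point *trips;
    size_t num_trips;
    const struct bind_param *binds;
    size_t num_binds;
};

/*
 * Assumed hysteresis rule: a trip is crossed once temp reaches its
 * temperature, and stays crossed until temp drops to or below
 * (temperature - hysteresis).
 */
static bool trip_crossed(const struct trip_point *trip, int temp,
                         bool was_crossed)
{
    assert(trip != NULL);
    if (temp >= trip->temperature)
        return true;
    return was_crossed && temp > trip->temperature - trip->hysteresis;
}

/* True if the binding applies to the trip at the given index. */
static bool bind_applies(const struct bind_param *bind, size_t trip_index)
{
    assert(bind != NULL);
    if (trip_index >= 32)
        return false;       /* outside the range of the mask */
    return (bind->trip_mask >> trip_index) & 1u;
}
```

With a passive trip at 100000 mC and 2000 mC of hysteresis, trip_crossed() in this sketch reports the trip as crossed at 100000 mC and keeps reporting it until the temperature is back at 98000 mC or lower.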
Here is an example (on OMAP4430):

    thermal_zone {
        type = "CPU";
        mask = <0x03>; /* trips writability */
        passive_delay = <250>; /* milliseconds */
        polling_delay = <1000>; /* milliseconds */
        governor = "step_wise";
        trips {
            alert@100000{
                temperature = <100000>; /* milliCelsius */
                hysteresis = <2000>; /* milliCelsius */
                type = <THERMAL_TRIP_PASSIVE>;
            };
            crit@125000{
                temperature = <125000>; /* milliCelsius */
                hysteresis = <2000>; /* milliCelsius */
                type = <THERMAL_TRIP_CRITICAL>;
            };
        };
        bind_params {
            action@0{
                cooling_device = "thermal-cpufreq";
                weight = <100>; /* percentage */
                mask = <0x01>;
                /* no limits, using defaults */
            };
        };
    };

In the current proposal, a 'thermal_zone' node would be embedded inside a temperature sensor node, for simplicity. But other possible builds could embed them in the device with thermal limits (CPU nodes, for instance), or they could be left outside of any specific node.

A fully documented description can be found here [4]. Also, my branch here [5] contains:
(a) the changes needed in order to have this DT parser;
(b) the DT parser with documentation;
(c) examples of how drivers could be changed to use the parser.

I wrote the thermal DT parser to build thermal zones with the thermal framework API. However, if one does not want to do that, one can simply leave CONFIG_THERMAL_OF=y out of the build; the calls will then be translated to nops, and the device tree thermal data can be parsed by whoever else is interested (another subsystem or even user land).

A TODO on this implementation is that it still lacks the representation of thermal zones composed of several sensors. However, I believe it is better to take an incremental approach here. This series can already be used to improve most of the existing platform thermal drivers (most are CPU thermal drivers) and to reuse the existing code of some hwmon sensors to build thermal zones for board thermal requirements.
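As an illustration of the "translated to nops" behaviour when CONFIG_THERMAL_OF is not set, here is a minimal sketch of the usual compile-time stub pattern. The function name thermal_zone_build_from_dt(), the stand-in struct device_node and sensor_probe() are hypothetical and are not taken from the series; only the CONFIG_THERMAL_OF symbol comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device tree node handle. */
struct device_node {
    const char *name;
};

#ifdef CONFIG_THERMAL_OF
/* Real parser: would build a thermal zone from the node's thermal data. */
int thermal_zone_build_from_dt(const struct device_node *np);
#else
/* Stub: with the option disabled, the call compiles down to a nop. */
static inline int thermal_zone_build_from_dt(const struct device_node *np)
{
    (void)np;
    return 0;
}
#endif

/* A sensor driver can call the parser unconditionally, without #ifdefs. */
static int sensor_probe(const struct device_node *np)
{
    assert(np != NULL);
    return thermal_zone_build_from_dt(np);
}
```

The point of this arrangement is that driver code stays free of conditional compilation; whether the device tree thermal data is consumed by the thermal framework is decided purely by the build configuration.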
I have already posted a patch series with this proposal at [6], which contains a reference to the original RFC. But it looks like my messages got moderated on the device tree mailing list. Obviously, within the PM forum, feedback was quite positive. However, we cannot proceed without a proper assessment from other subsystems. The lm-sensors folks (Guenter) seem to be strongly against this series, as there is a fear that it may introduce a misuse of DT. I still believe this is needed for hardware description, and thus is not an infringement of DT purposes.

Please let me know your thoughts on this topic, and my apologies if my previous messages on this topic did not reach you (I hope they reach you now).

All best,

Eduardo Valentin

[0] - https://lkml.org/lkml/2013/7/17/621
[1] - https://lkml.org/lkml/2013/7/18/279
[2] - www.devicetree.org
[3] - http://www.acpi.info/
[4] - https://git.kernel.org/cgit/linux/kernel/git/evalenti/linux.git/diff/Documentation/devicetree/bindings/thermal/thermal.txt?h=thermal_work/thermal_core/dt_parser&id=405bf0b51457ed055a082af2653d7ce757bc2e91
[5] - https://git.kernel.org/cgit/linux/kernel/git/evalenti/linux.git/log/?h=thermal_work/thermal_core/dt_parser
[6] - https://lkml.org/lkml/2013/7/17/923

--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin