On Mon, Jun 17, 2019 at 8:44 AM Georgi Djakov <georgi.djakov@xxxxxxxxxx> wrote:
>
> Hi Saravana,
>
> On 6/14/19 07:17, Saravana Kannan wrote:
> > Add a icc_create_devfreq() and icc_remove_devfreq() to create and remove
> > devfreq devices for interconnect paths. A driver can create/remove devfreq
> > devices for the interconnects needed for its device by calling these APIs.
> > This would allow various devfreq governors to work with interconnect paths
> > and the device driver itself doesn't have to actively manage the bandwidth
> > votes for the interconnects.
>
> Thanks for the patches, but creating devfreq devices for each interconnect path
> seems odd to me - at least for consumers that already use a governor.

Each governor instance always handles one "frequency" (really, performance) domain at a time. So if a consumer is already using a governor to scale the hardware block itself, then using another governor to scale the interconnect performance points is the right way to go about it. In fact, that's exactly what the devfreq passive governor's documentation says it's meant for. It's also what cpufreq does for each cluster/CPU frequency domain.

> So for DDR
> scaling for example, are you suggesting that we add a devfreq device from the
> cpufreq driver in order to scale the interconnect between CPU<->DDR?

Yes, in general. Although CPUs are a special case because CPUs don't go through devfreq, so the passive governor as it stands today won't work. CPU<->DDR scaling might need a separate governor (unlikely) or some changes to the passive governor that I'm happy to work on once we settle this for general devices like GPUs, etc. But the DT format for CPUs will be identical to GPUs or any other device.

> Also if the
> GPU is already using devfreq, should we add a devfreq per each interconnect
> path? What would be the benefit in this case - using different governors for
> bandwidth scaling maybe?

When I say "separate/different governors" in this email, I mean both different instances of the same governor logic with different tunables AND entirely different algorithms/governor logic.

The heuristics to use for each interconnect path might be (more like, will be) different based on hardware characteristics (e.g. which voltage domains the interconnect sits on) and what information is available for the interconnect (e.g. just busy time vs. bandwidth counters vs. no information at all) -- so having a separate governor for each interconnect path makes a lot of sense.

It also allows userspace to control the policy for scaling each of those paths based on product use cases. For example, when the GPU is just doing simple UI rendering, userspace can use the devfreq device's max_freq sysfs file to disallow high bandwidth OPPs on the GPU<->DDR path, but allow those higher OPPs when the GPU is used for games.

Having a devfreq device for each interconnect path also makes it easy to debug performance issues -- you can independently change the votes for each path to figure out what is causing the bottleneck, etc.

Adding a devfreq device for interconnect voting with a few lines of code gives all these features "for free".

This doesn't mean all users of the interconnect framework NEED to use devfreq for interconnects. They might set their bandwidth votes with simple calculations based on the use case (e.g. a display driver deriving bandwidth from the display resolution). But if they are trying to use any kind of algorithm/heuristics, writing it as a devfreq governor should be encouraged.
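To make the consumer side concrete, here's a rough sketch of what a GPU driver's probe could look like with this series. This is only an illustration: the icc_create_devfreq() arguments and return value below are simplified for the example (see the actual patch for the real signature), and the "gpu-mem" path name is made up.

#include <linux/err.h>
#include <linux/interconnect.h>
#include <linux/platform_device.h>

static int my_gpu_probe(struct platform_device *pdev)
{
	struct device *dev = &pdev->dev;
	struct icc_path *path;
	int ret;

	/* The path is described in DT via interconnects/interconnect-names */
	path = of_icc_get(dev, "gpu-mem");
	if (IS_ERR(path))
		return PTR_ERR(path);

	/*
	 * Hand the path over to the proposed API so a devfreq device is
	 * created for it and a governor (e.g. passive) manages the bandwidth
	 * votes. The signature here (returning 0 or -errno) is assumed for
	 * the sake of the example.
	 */
	ret = icc_create_devfreq(dev, path);
	if (ret)
		icc_put(path);

	return ret;
}

With that in place, the driver never has to call icc_set_bw() on this path itself; the governor attached to the new devfreq device does the voting.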
Also want to point out that BW OPPs also work for drivers that don't use devfreq at all. The interconnect-opp-table just lists the meaningful OPP levels for the path and the device driver can pick one entry from the table based on the use case.

Thanks,
Saravana