On Tue, 16 Jul 2019 at 02:56, Saravana Kannan <saravanak@xxxxxxxxxx> wrote: > > On Mon, Jul 15, 2019 at 1:16 AM Vincent Guittot > <vincent.guittot@xxxxxxxxxx> wrote: > > > > On Tue, 9 Jul 2019 at 21:03, Saravana Kannan <saravanak@xxxxxxxxxx> wrote: > > > > > > On Tue, Jul 9, 2019 at 12:25 AM Vincent Guittot > > > <vincent.guittot@xxxxxxxxxx> wrote: > > > > > > > > On Sun, 7 Jul 2019 at 23:48, Saravana Kannan <saravanak@xxxxxxxxxx> wrote: > > > > > > > > > > On Thu, Jul 4, 2019 at 12:12 AM Vincent Guittot > > > > > <vincent.guittot@xxxxxxxxxx> wrote: > > > > > > > > > > > > On Wed, 3 Jul 2019 at 23:33, Saravana Kannan <saravanak@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > On Tue, Jul 2, 2019 at 11:45 PM Vincent Guittot > > > > > > > <vincent.guittot@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > On Wed, 3 Jul 2019 at 03:10, Saravana Kannan <saravanak@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > Interconnect paths can have different performance points. Now that OPP > > > > > > > > > framework supports bandwidth OPP tables, add OPP table support for > > > > > > > > > interconnects. > > > > > > > > > > > > > > > > > > Devices can use the interconnect-opp-table DT property to specify OPP > > > > > > > > > tables for interconnect paths. And the driver can obtain the OPP table for > > > > > > > > > an interconnect path by calling icc_get_opp_table(). > > > > > > > > > > > > > > > > The opp table of a path must come from the aggregation of OPP tables > > > > > > > > of the interconnect providers. > > > > > > > > > > > > > > The aggregation of OPP tables of the providers is certainly the > > > > > > > superset of what a path can achieve, but to say that OPPs for > > > > > > > interconnect path should match that superset is an oversimplification > > > > > > > of the reality in hardware. > > > > > > > > > > > > > > There are lots of reasons an interconnect path might not want to use > > > > > > > all the available bandwidth options across all the interconnects in > > > > > > > the route. > > > > > > > > > > > > > > 1. That particular path might not have been validated or verified > > > > > > > during the HW design process for some of the frequencies/bandwidth > > > > > > > combinations of the providers. > > > > > > > > > > > > All these constraint are provider's constraints and not consumer's one > > > > > > > > > > > > The consumer asks for a bandwidth according to its needs and then the > > > > > > providers select the optimal bandwidth of each interconnect after > > > > > > aggregating all the request and according to what OPP have been > > > > > > validated > > > > > > > > > > Not really. The screening can be a consumer specific issue. The > > > > > consumer IP itself might have some issue with using too low of a > > > > > bandwidth or bandwidth that's not within some range. It should not be > > > > > > > > How can an IP ask for not enough bandwidth ? > > > > It asks the needed bandwidth based on its requirements > > > > > > The "enough bandwidth" is not always obvious. It's only for very > > > simple cases that you can calculate the required bandwidth. Even for > > > cases that you think might be "obvious/easy" aren't always easy. > > > > > > For example, you'd think a display IP would have a fixed bandwidth > > > requirement for a fixed resolution screen. But that's far from the > > > truth. It can also change as the number of layers change per frame. > > > For video decoder/encoder, it depends on how well the frames compress > > > with a specific compression scheme. > > > So the "required" bandwidth is often a heuristic based on the IP > > > frequency or traffic measurement. > > > > > > But that's not even the point I was making in this specific "bullet". > > > > > > A hardware IP might be screen/verified with only certain bandwidth > > > levels. Or it might have hardware bugs that prevent it from using > > > lower bandwidths even though it's technically sufficient. We need a > > > way to capture that per path. This is not even a fictional case. This > > > has been true multiple times over widely used IPs. > > > > here you are mixing HW constraint on the soc and OPP screening with > > bandwidth request from consumer > > ICC framework is about getting bandwidth request not trying to fix > > some HW/voltage dependency of the SoC > > > > > > > > > > the provider's job to take into account all the IP that might be > > > > > connected to the interconnects. If the interconnect HW itself didn't > > > > > > > > That's not what I'm saying. The provider knows which bandwidth the > > > > interconnect can provide as it is the ones which configures it. So if > > > > the interconnect has a finite number of bandwidth point based probably > > > > on the possible clock frequency and others config of the interconnect, > > > > it selects the best final config after aggregating the request of the > > > > consumer. > > > > > > I completely agree with this. What you are stating above is how it > > > should work and that's the whole point of the interconnect framework. > > > > > > But this is orthogonal to the point I'm making. > > > > It's not orthogonal because you want to add a OPP table pointer in the > > ICC path structure to fix your platform HW constraint whereas it's not > > the purpose of the framework IMO > > > > > > > > > > change, the provider driver shouldn't need to change. By your > > > > > definition, a provider driver will have to account for all the > > > > > possible bus masters that might be connected to it across all SoCs. > > > > > > > > you didn't catch my point > > > > > > Same. I think we are talking over each other. Let me try again. > > > > > > You are trying to describe how and interconnect provider and framework > > > should work. There's no disagreement there. > > > > > > My point is that consumers might not want to or can not always use all > > > the available bandwidth levels offered by the providers. There can be > > > many reasons for that (which is what I listed in my earlier emails) > > > and we need a good and generic way to capture that so that everyone > > > isn't trying to invent their own property. > > > > And my point is that you want to describe some platform or even UCs > > specific constraint in the ICC framework which is not the place to do. > > > > If the consumers might not want to use all available bandwidth because > > this is not power efficient as an example, this should be describe > > somewhere else to express that there is a shared power domain > > between some devices and we shoudl ensure that all devices in this > > power domain should use the Optimal Operating Point (optimal freq for > > a voltage) > > My patch series has nothing to do with shared power domains. I think > the examples have made it amply clear. It's far from being clear why a consumer doesn't want to use some bandwidth level TBH Do you have a real example ? > > > ICC framework describes the bandwidth request that are expressed by > > the consumers for the current running state of their IP but it doesn't > > reflect the fact that on platform A, the consumer should use bandwidth > > X because it will select a voltage level of a shared power domain that > > is optimized for the other devices B, C ... . It's up to the provider > > to know HW details of the bus that it drives and to make such > > decision; the consumer should always request the same > > The change to ICC framework is practically just this. I don't have any > future changes planned for the ICC framework. This is the entirety of > it. > > + opp_node = of_parse_phandle(np, "interconnect-opp-table", idx); > + if (opp_node) { > + path->opp_table = dev_pm_opp_of_find_table_from_node(opp_node); > + of_node_put(opp_node); > + } > > It's quite a stretch and bit hyperbolic to say this one change is > getting ICC framework to do all the things you claim above. > So I clearly don't see the benefit of adding this opp_table field in icc_path struct because I'm still convinced that the consumer doesn't have to get a bandwidth table like that. If the consumer already get a bandwidth value and it should in order to call icc_set or even in order to select one element in your table, then it should directly set it with icc_set and let the provider aggregate and choose the real one > It's literally a simple helper function so that the consumer doesn't > have to make assumptions about indices and it's a bit more explicit > about which OPP table of the device (a device can have multiple OPP > tables) corresponds to which ICC path. > > Going by your extreme argument, one can also claim that it's not the > ICC framework's job to make it easy for consumers to figure out the > source/destination endpoints or give them names and delete the > interconnect and interconnect-names properties. That's clearly just as > absurd a claim. > > > -Saravana > > > > > > That's not good design nor is it scalable. > > > > > > > > > > > > > > > > > > > 2. Similarly during parts screening in the factory, some of the > > > > > > > combinations might not have been screened and can't be guaranteed > > > > > > > to work. > > > > > > > > > > > > As above, it's the provider's job to select the final bandwidth > > > > > > according to its constraint > > > > > > > > > > Same reply as above. > > > > > > > > > > > > > > > > > > > 3. Only a certain set of bandwidth levels might make sense to use from > > > > > > > a power/performance balance given the device using it. For example: > > > > > > > - The big CPU might not want to use some of the lower bandwidths > > > > > > > but the little CPU might want to. > > > > > > > - The big CPU might not want to use some intermediate bandwidth > > > > > > > points if they don't save a lot of power compared to a higher > > > > > > > bandwidth levels, but the little CPU might want to. > > > > > > > - The little CPU might never want to use the higher set of > > > > > > > bandwidth levels since they won't be power efficient for the use > > > > > > > cases that might run on it. > > > > > > > > > > > > These example are quite vague about the reasons why little might never > > > > > > want to use higher bandwidth. > > > > > > > > > > How is it vague? I just said because of power/performance balance. > > > > > > > > > > > But then, if little doesn't ask high bandwidth it will not use them. > > > > > > > > > > If you are running a heuristics based algorithm to pick bandwidth, > > > > > this is how it'll know NOT to use some of the bandwidth levels. > > > > > > > > so you want to set a bandwidth according to the cpu frequency which is > > > > what has been proposed in other thread > > > > > > Nope, that's just one heuristic. Often times it's based on hardware > > > monitors measuring interconnect activity. If you go look at the SDM845 > > > in a Pixel 3, almost nothing is directly tied to the CPU frequency. > > > > > > Even if you are scaling bandwidth based on other hardware > > > measurements, you might want to avoid some bandwidth level provided by > > > the interconnect providers because it's suboptimal. > > > > > > For example, when making bandwidth votes to accommodate the big CPUs, > > > you might never want to use some of the lower bandwidth levels because > > > they are not power efficient for any CPU frequency or any bandwidth > > > level. Because at those levels the memory/interconnect is so slow that > > > it has a non-trivial utilization increase (because the CPU is > > > stalling) of the big CPUs. > > > > > > Again, this is completely different from what the providers/icc > > > framework does. Which is, once the request is made, they aggregate and > > > set the actual interconnect frequencies correctly. > > > > > > > > > > > > > > > > > > > > > > 4. It might not make sense from a system level power perspective. > > > > > > > Let's take an example of a path S (source) -> A -> B -> C -> D > > > > > > > (destination). > > > > > > > - A supports only 2, 5, 7 and 10 GB/s. B supports 1, 2 ... 10 GB/s. > > > > > > > C supports 5 and 10 GB/s > > > > > > > - If you combine and list the superset of bandwidth levels > > > > > > > supported in that path, that'd be 1, 2, 3, ... 10 GB/s. > > > > > > > - Which set of bandwidth levels make sense will depend on the > > > > > > > hardware characteristics of the interconnects. > > > > > > > - If B is the biggest power sink, then you might want to use all 10 > > > > > > > levels. > > > > > > > - If A is the biggest power sink, then you might want to use all 2, > > > > > > > 5 and 10 GB/s of the levels. > > > > > > > - If C is the biggest power sink then you might only want to use 5 > > > > > > > and 10 GB/s > > > > > > > - The more hops and paths you get the more convoluted this gets. > > > > > > > > > > > > > > 5. The design of the interconnects themselves might have an impact on > > > > > > > which bandwidth levels are used. > > > > > > > - For example, the FIFO depth between two specific interconnects > > > > > > > might affect the valid bandwidth levels for a specific path. > > > > > > > - Say S1 -> A -> B -> D1, S2 -> C -> B -> D1 and S2 -> C -> D2 are > > > > > > > three paths. > > > > > > > - If C <-> B FIFO depth is small, then there might be a requirement > > > > > > > that C and B be closely performance matched to avoid system level > > > > > > > congestion due to back pressure. > > > > > > > - So S2 -> D1 path can't use all the bandwidth levels supported by > > > > > > > C-B combination. > > > > > > > - But S2 -> D2 can use all the bandwidth levels supported by C. > > > > > > > - And S1 -> D1 can use all the levels supported by A-B combination. > > > > > > > > > > > > > > > > > > > All the examples above makes sense but have to be handle by the > > > > > > provider not the consumer. The consumer asks for a bandwidth according > > > > > > to its constraints. Then the provider which is the driver that manages > > > > > > the interconnect IP, should manage all this hardware and platform > > > > > > specific stuff related to the interconnect IP in order to set the > > > > > > optimal bandwidth that fit both consumer constraint and platform > > > > > > specific configuration. > > > > > > > > > > Sure, but the provider itself can have interconnect properties to > > > > > indicate which other interconnects it's tied to. And the provider will > > > > > still need the interconnect-opp-table to denote which bandwidth levels > > > > > are sensible to use with each of its connections. > > > > > > You seem to have missed this comment. > > > > > > Thanks, > > > Saravana > > > > > > > > So in some instances the interconnect-opp-table covers the needs of > > > > > purely consumers and in some instances purely providers. But in either > > > > > case, it's still needed to describe the hardware properly. > > > > > > > > > > -Saravana > > > > > > > > > > > > These are just some of the reasons I could recollect in a few minutes. > > > > > > > These are all real world cases I had to deal with in the past several > > > > > > > years of dealing with scaling interconnects. I'm sure vendors and SoCs > > > > > > > I'm not familiar with have other good reasons I'm not aware of. > > > > > > > > > > > > > > Trying to figure this all out by aggregating OPP tables of > > > > > > > interconnect providers just isn't feasible nor is it efficient. The > > > > > > > OPP tables for an interconnect path is describing the valid BW levels > > > > > > > supported by that path and verified in hardware and makes a lot of > > > > > > > sense to capture it clearly in DT. > > > > > > > > > > > > > > > So such kind of OPP table should be at > > > > > > > > provider level but not at path level. > > > > > > > > > > > > > > They can also use it if they want to, but they'll probably want to use > > > > > > > a frequency OPP table. > > > > > > > > > > > > > > > > > > > > > -Saravana > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx> > > > > > > > > > --- > > > > > > > > > drivers/interconnect/core.c | 27 ++++++++++++++++++++++++++- > > > > > > > > > include/linux/interconnect.h | 7 +++++++ > > > > > > > > > 2 files changed, 33 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c > > > > > > > > > index 871eb4bc4efc..881bac80bc1e 100644 > > > > > > > > > --- a/drivers/interconnect/core.c > > > > > > > > > +++ b/drivers/interconnect/core.c > > > > > > > > > @@ -47,6 +47,7 @@ struct icc_req { > > > > > > > > > */ > > > > > > > > > struct icc_path { > > > > > > > > > size_t num_nodes; > > > > > > > > > + struct opp_table *opp_table; > > > > > > > > > struct icc_req reqs[]; > > > > > > > > > }; > > > > > > > > > > > > > > > > > > @@ -313,7 +314,7 @@ struct icc_path *of_icc_get(struct device *dev, const char *name) > > > > > > > > > { > > > > > > > > > struct icc_path *path = ERR_PTR(-EPROBE_DEFER); > > > > > > > > > struct icc_node *src_node, *dst_node; > > > > > > > > > - struct device_node *np = NULL; > > > > > > > > > + struct device_node *np = NULL, *opp_node; > > > > > > > > > struct of_phandle_args src_args, dst_args; > > > > > > > > > int idx = 0; > > > > > > > > > int ret; > > > > > > > > > @@ -381,10 +382,34 @@ struct icc_path *of_icc_get(struct device *dev, const char *name) > > > > > > > > > dev_err(dev, "%s: invalid path=%ld\n", __func__, PTR_ERR(path)); > > > > > > > > > mutex_unlock(&icc_lock); > > > > > > > > > > > > > > > > > > + opp_node = of_parse_phandle(np, "interconnect-opp-table", idx); > > > > > > > > > + if (opp_node) { > > > > > > > > > + path->opp_table = dev_pm_opp_of_find_table_from_node(opp_node); > > > > > > > > > + of_node_put(opp_node); > > > > > > > > > + } > > > > > > > > > + > > > > > > > > > + > > > > > > > > > return path; > > > > > > > > > } > > > > > > > > > EXPORT_SYMBOL_GPL(of_icc_get); > > > > > > > > > > > > > > > > > > +/** > > > > > > > > > + * icc_get_opp_table() - Get the OPP table that corresponds to a path > > > > > > > > > + * @path: reference to the path returned by icc_get() > > > > > > > > > + * > > > > > > > > > + * This function will return the OPP table that corresponds to a path handle. > > > > > > > > > + * If the interconnect API is disabled, NULL is returned and the consumer > > > > > > > > > + * drivers will still build. Drivers are free to handle this specifically, but > > > > > > > > > + * they don't have to. > > > > > > > > > + * > > > > > > > > > + * Return: opp_table pointer on success. NULL is returned when the API is > > > > > > > > > + * disabled or the OPP table is missing. > > > > > > > > > + */ > > > > > > > > > +struct opp_table *icc_get_opp_table(struct icc_path *path) > > > > > > > > > +{ > > > > > > > > > + return path->opp_table; > > > > > > > > > +} > > > > > > > > > + > > > > > > > > > /** > > > > > > > > > * icc_set_bw() - set bandwidth constraints on an interconnect path > > > > > > > > > * @path: reference to the path returned by icc_get() > > > > > > > > > diff --git a/include/linux/interconnect.h b/include/linux/interconnect.h > > > > > > > > > index dc25864755ba..0c0bc55f0e89 100644 > > > > > > > > > --- a/include/linux/interconnect.h > > > > > > > > > +++ b/include/linux/interconnect.h > > > > > > > > > @@ -9,6 +9,7 @@ > > > > > > > > > > > > > > > > > > #include <linux/mutex.h> > > > > > > > > > #include <linux/types.h> > > > > > > > > > +#include <linux/pm_opp.h> > > > > > > > > > > > > > > > > > > /* macros for converting to icc units */ > > > > > > > > > #define Bps_to_icc(x) ((x) / 1000) > > > > > > > > > @@ -28,6 +29,7 @@ struct device; > > > > > > > > > struct icc_path *icc_get(struct device *dev, const int src_id, > > > > > > > > > const int dst_id); > > > > > > > > > struct icc_path *of_icc_get(struct device *dev, const char *name); > > > > > > > > > +struct opp_table *icc_get_opp_table(struct icc_path *path); > > > > > > > > > void icc_put(struct icc_path *path); > > > > > > > > > int icc_set_bw(struct icc_path *path, u32 avg_bw, u32 peak_bw); > > > > > > > > > > > > > > > > > > @@ -49,6 +51,11 @@ static inline void icc_put(struct icc_path *path) > > > > > > > > > { > > > > > > > > > } > > > > > > > > > > > > > > > > > > +static inline struct opp_table *icc_get_opp_table(struct icc_path *path) > > > > > > > > > +{ > > > > > > > > > + return NULL; > > > > > > > > > +} > > > > > > > > > + > > > > > > > > > static inline int icc_set_bw(struct icc_path *path, u32 avg_bw, u32 peak_bw) > > > > > > > > > { > > > > > > > > > return 0; > > > > > > > > > -- > > > > > > > > > 2.22.0.410.gd8fdbe21b5-goog > > > > > > > > > > > > > > > > > > > > > -- > > > > > > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@xxxxxxxxxxx. > > > > > >