Hi, Although I agree that the current clock framework can handle power or voltage domains in many platforms, having something like (struct clk powerdomain1, powerdomain2;) does not seem like a good implementation, a struct for clocks representing a power domain. If a new framework is more straighforward and introduces a negligible overhead to the current kernel, I think it is worthwhile to have a look at it. Plus this new framework might be able to take care of those platforms that are not nicely supported by the current clock framework. Ikhwan On 3/16/07, David Brownell <david-b@xxxxxxxxxxx> wrote: > On Thursday 15 March 2007 6:04 am, Igor Stoppa wrote: > > On Thu, 2007-03-15 at 12:53 +0300, ext Eugeny S. Mints wrote: > > > > On Wednesday 14 March 2007 3:43 am, Eugeny S. Mints wrote: > > > >>> Would this involve replacing the clock framework, or are they going to coexist? > > > >> parameter framework would eventually replace clock framework. > > > > > > > > That seems to be the wrong answer. Especially since nothing has > > > > been shown to be wrong with the clock interface; much less to be > > > > unfixably wrong (hence justifying replacement). > > > > > > a cherry-picking on clk fw API: > > > > > > - clk_set_rate() sticks to an individual clock - no way to set rates for number > > > of clocks at once instead of having series of clk_set_rate() calls. > > This isn't "wrong", it's a "lack of feature". The normal process for > addressing such things is to improve working/deployed software interfaces, > not throw them out and try to create and deploy something new. :) > > Plus, that's not actually true. When you change the rate for a parent > clock, all its derived clocks necessarily change ... however many they > may be. And there's nothing to prevent updating non-derived clocks. > > Of course, such a change *does* touch on something I'd agree is a notable > omission: requests to change such a clock rate (call it a "base" rate > for purposes of discussion here) are currently safe only if the related > clocks have no drivers using them, since there's no way for those driver > to either veto the change ("I can't support 115200 baud at that base rate"), > or reclock to handle the changed environment ("update clock divisor to > maintain current serial port baud rate"). > > Which is why currently most clock rates aren't currently changeable; and > those which can be changed are mostly leaves on the clock tree. > > On some platforms that significantly impacts what cpufreq (or whatever > DVFS scheme is used) could achieve. > > > > > The former > > > for example is required for clocks which have required predefined ratio > > > dependency though: two clocks always have to be n:2n and any other ratio leads > > > to hardware misbehavior. > > The programming interface doesn't preclude enforcement of that rule > though. In fact, I'd call any *implementation* buggy if the hardware > has such a rule about base rates, but the software doesn't enforce it. > (That is: it's not an interface issue.) > > > > > So, you can't go from 100:200 to 300:600 via series of > > > clk_set_rate() for such clocks, i.e. 100:200 -> 300:200 -> 300:600 or 100:200 -> > > > 100:600 -> 300:600 (see SH7722 hw for reference) > > AT91 (e.g. at91sam9263) processors have MCKR with similar restrictions, > but the only time this could ever matter seems to be for cpufreq. > > Does this matter to you for any other reason than trying to use the > clock framework to implement cpufreq? If not, there's no need for > the clock framework to define how to do this ... since cpufreq is > so highly platform-specific. > > If you want to say that this makes it hard to implement cpufreq in terms > of the clock API, I'd agree ... but then, the *complete* lack of driver > clock change notifications is a far bigger issue. Drivers like MMC/SD, > UART, SPI, and I2C commonly need to update internal clock settings when > rates change, and may need to plug their lowlevel I/O queues while those > base clock rates change. > > > > What's wrong with expanding the clk_fw? > > All is needed is: > > clk_set_rate_buffered(clk1, 300); > > clk_set_rate_buffered(clk2, 600); > > clk_rate_flush(); /* can include validation of the set */ > > > > Which is, incidentally, what OMAP2 does in hw for all the relevant clk > > dividers and it also provides validation for the new set of values. > > > > Furthermore if the original assumption that complex transitions are > > allowed only atomically (p1A, p1B) => (p2A, p2B), hw support is > > mandatory, otherwise the transition is impossible, no matter what fancy > > sw fw is performing it. > > I think Scott's "power transaction" notion is a bit better, since > it should among other things provide a way to cope with cases like > two concurrent tasks trying issue conflicting changes. An SMP > system ought to be able to reclock two or more CPU cores at once; > and maybe even do it with one "power transaction". > > > > > - only a rate can be set up via clk_set_rate() while for a PLL I want to set up > > > a desired idle state as well (btw there can be more than one idle state) > > > > Is there any PLL which provide automatic switching between running/idle state? > > Do you have any HW architecture on your mind? > > And why should such a hardware-specific mechanism show up in an > all-platforms API? There's quite a lot of hardware-specific stuff > going on, and that seems like just one more instance of it. > > > > > - current API does not provide support for clock domains > > > > So? > > Exactly. :) > > And again, that's not strictly true either; the clock tree itself > defines one kind of domain, also known as "subtrees". > > A point I'll make below with respect to power domains: unless > those domains need to be exposed for some reason to drivers, > there is no need for these to show up the programming interface > except as implementation artifacts. As in: activate this clock, > and platform-specific stuff happens. It always does, of course, > but in some cases it will be more extensive than in others. > > (Activating a PLL normally takes more than setting a bit to > open a clock gate; activating some clocks might invove kicking > some clock or power domain logic.) > > > > > a cherry-picking on clk fw implementation: > > Which platform's implementation? I've seen several... > > > > > - clock tree structure and traversing clock tree code are duplicated in every > > > architecture while can be done in arch independent way and just once (hint: what > > > indeed is an arch specific thing it's a clock tree configuration) > > > > that looks more like an optimisation to the clk fw, not a call for a brand new one > > And I'm not even sure calling it an "optimization" is correct. > > Given how arch-specific the tree structure is, it's not clear > that traversing it shouldn't be arch-specific too. And in any > case ... the traversal code I've seen is so trivial I'd have a > hard time worring about sharing it. Implementation flexibility > seems to be widely useful in handling arch-specific differences. > > > > > >> Separate clock and voltage frameworks > > Not that we _have_ a voltage framework yet, e.g. "turn on that > that power rail, configured at 3.3V". > > Plus rememember that a voltage framework is not synonymous with > a DVFS kit ... and that a DVFS kit may need to interact with a > lot more than just voltage and frequency controls. Example, > reclocking CPU and memory frequency may need to run out of SRAM > and without accessing page tables stored in DRAM. > > > > > >> lead to code and functionality duplication > > But since a clock is not a voltage, the primitives must differ. > There's no analogue of "allocate 80 mA" for a clock, as there > would be with a voltage framework's power budget primitives. > > > > > >> and do not address > > > >> such things as relationship between clocks and voltages, clock/voltage/power > > > >> domains, etc needed for aggressive power management. > > > > relationships are very arch specific so having or not the fw there > > wouldn't probably make any real difference. > > About power domains ... well, what's the deal? > > Aren't they controlled like voltages, in the end? > > The last time I looked at a power domain, it was more like a > prerequisite to using various device clocks. It wasn't something > that needed to be exposed in an API ... it would suffice to > have it be activated/deactivated as a side effect of accessing > those clocks. > > That is, the existing hooks suffice. The platform-specific code > to enable a given clock subtree would just have a few lines of > code teaching it how to enable the relevant power domain. They > already rely on platform-specific logic to sort out which register > bank to touch, which bits there, and various other things. > > > Although that recent work for memory partitioning is relevant > here too: being able to power down banks of DRAM (ensuring they > hold no data or code that's currently needed) can certainly cut > down one category of power drain. > > In the same vein, some utilities to manage on-chip SRAM would > likely be helpful. It's common that certain low power modes > disable all DRAM, leaving only (say) 16 KB of SRAM available > for running code ... or that DRAM be unavailable while CPU > and/or memory are being reclocked (maybe by a DVFS scheme, > using cpufreq or whatever). > > > > > > Most clocks don't have those issues. Why penalize all clocks for > > > > issues which only relate to a few? Better to only do that for the > > > > few cocks which have such additional constraints. > > > > > > parameter fw would give exactly this behavior: relationship between clocks and > > > voltages, clock/voltage/power domains implementation would be just additional > > > arcs between nodes for clock and voltages. Nowadays clk fw has only one type of > > > arcs - parent-child type. parameter fw would bring additional types. But clock > > > nodes would be linked just with required arcs (of required type; both are arch > > > specific things) so for an arch without any additional dependencies between > > > clocks (and voltages) parameter fw would end up with exactly equivalent clock > > > tree as clk fw for the arch has today. > > > > I think what has to be clarified, is wether there is a real need for new > > features or not. > > Not just "new features" but "new cross-platform features". This PM stuff > is really down'n'dirty code ... who will be working with such code? The less > we demand of those people, the more likely it is that such features would > actually be delivered in working form. A lot of platform vendors work > with older kernel releases (2.6.10 etc), which means that new APIs will > be obstacles to development and deployment. > > Going back to clock and power domain examples above: it's *easy* to see how > those could be delivered using the current clock framework. There are a LOT of > folk who are familiar with those interfaces -- today. So unless there really > is no way to address the issue with the current APIs, it's best to avoid > creating new interfaces for those issues. > > > > > > Plus, remember that the clock framework is an interface ... so by > > > > definition, it has no code associated with it. Hence no duplication > > > > of code is possible... at least at this hand-wavey "concept" level. > > > > Possibly a given implementation has scope for code sharing; but I > > > > doubt it. Code behind a given implementation of the clock interface > > > > is invariably quite slim. > > > > > > and invariably looks like a hack and still duplicate clock tree building and > > > traversing code. > > Most cases I've seen build the clock tree with a bunch of static struct > declarations, so the building is all at compile time ... and could hardly > duplicate something on another platform, since it's all platform-specific. > And as noted above, traversal code is trivial ... > > > > > Dependencies which exist in modern hw between clocks, clocks > > > and voltages require a more straight and standard technique to be set up for > > > implementing clk/vltg/name_it framework. > > I think Linus has pointed out that cut'n'paste is a very straightforward > and standard technique. :) > > The clock-to-clock dependencies are handled by the clock framework > already. As are at least some cases of power domains. > > DVFS -- cpufreq or otherwise -- is the primary example of a case where > coupling is needed between clocks and voltages. These PM discussions > lately seem to dance around DVFS without saying so explicitly ... seemingly, > to avoid the "so why not just fix cpufreq" discussions that would (should!) > naturally follow. > > My own two cents on the cpufreq thing: (a) it should support the driver > model better; (b) a generic "run from SRAM" solution would be an aid to > that and other PM code; (c) without clock framework support for drivers > being able to reject changes or else re-clock, changing base clocks is > often impractical; (d) I don't recall seeing anyone really answer the > "so why not ust fix cpufreq" question. > > > > > Moving most of the code to be arch > > > independent and setting a clear rules on how a clock/vltg tree configuration > > > would look like while just looking at a hw manual would help. > > > > I think clocks and voltages providers have very different behaviours: for example > > a voltage src can sometimes be put in quiescent state, where the voltage value is > > preserved, but the max current is significantly reduced, therefore minimizing the > > leakage. I wouldn't welcome such functionality to be merged with clock handling. > > Another good example of these APIs being domain-specific ... though > in a way it's a special case of the one I gave above. In a similar > vein, lower current (but not quiescent) modes may be able to use > alternate schemes for voltage regulation. The regulation may need > software inputs to switch schemes. > > > > > > If a clock being enabled implies a power or voltage domain being active, > > > > there's no reason that constraint shouldn't be enforced by whatever > > > > implementation a given platform uses. > > > > > > true. but it's the same functionality required on different arches. and it can > > > be done in arch independent way without penalties for arches which do not > > > require the functionality. What's the rationale to move it down to arch specific > > > code then? > > > > I'd rather ask: what's the benefit of merging apples and oranges? > > Someone may be tiring of a steady diet, and wants to become a chef? :) > > I'll repeat that power and clock domain examples above: the current > clock framework is **ALREADY** arch-independant in exactly the way > Eugeny sketched, for several platforms. > > > > > > And having a generic -- basically untyped -- notion of "parameter" > > > > seems significantly less good than having a typed notion, with > > > > type-specific operations. > > > > > > there is no type-specific operations. clk_set_rate() and vltg_set() basically do > > > the same thing - set a value for a pm resource provided by clock or voltage > > > device (regulator). The difference is that output of clock regulator is measured > > > in Hz and named 'frequesncy' and output of voltage regulator os measured in mV > > > and named 'voltage' which actually does not matter from API POV. So, i see more > > > sense in having param_set(), parame_set_state() (set_state primitive for idle > > > states) rather than in > > > > See my comment above about different, peculiar, behaviour of voltages. > > And that's just an example. > > One of several, I added more ... > > > > > clk_set_rate(), clk_set_state() > > > vltg_set(), vltg_set_state() > > vtg_allocate_budget() would have no clock domain analogue, > nor would vtg_free_budget() ... > > > > > and, our analysis shows that you would end up with a separate type for domains, > > > so it would be at least > > > domain_set_state() as well. > > See my comments above re clock and voltage domains, applicable to > at least half a dozen platforms. > > Until you share your analysis, so we can see what we agree and disagree > with, it's not helpful for you to claim that it supports your argument. > > > > > > Typed notions are easier to understand, > > > > read, and maintain. > > > > > > common, after tricks with spinlocks in [merged!] RT patches? ;) > > I don't follow this comment. I'm still waiting to see the NO_HZ stuff > become as stable and fast as the previous NO_IDLE_HZ stuff ... but > none of that seems to relate to type-unsafe programming interfaces. > > - Dave > > > > > > > > Eugeny > > > > _______________________________________________ > linux-pm mailing list > linux-pm@xxxxxxxxxxxxxx > https://lists.osdl.org/mailman/listinfo/linux-pm > _______________________________________________ linux-pm mailing list linux-pm@xxxxxxxxxxxxxx https://lists.osdl.org/mailman/listinfo/linux-pm