Dear All, > On 11.01.2014 06:25, Thomas Abraham wrote: > > On Fri, Jan 10, 2014 at 7:48 PM, Lukasz Majewski > > <l.majewski@xxxxxxxxxxx> wrote: > >> Hi Thomas, > >> > >>> Hi Lukasz, > >>> > >>> On Fri, Jan 10, 2014 at 5:34 PM, Lukasz Majewski > >>> <l.majewski@xxxxxxxxxxx> wrote: > >>>> Hi Thomas, > >>>> > >>>>> Add a new clock provider for ARM clock domain. This clock > >>>>> provider is composed of multiple components which include > >>>>> mux_core, div_core, div_core2, div_corem0, div_corem1, > >>>>> div_periph, div_atb, div_pclk_dbg, div_copy and div_hpm. This > >>>>> composition of mutiple components into a single clock provider > >>>>> helps with faster completion of CPU clock speed switching > >>>>> during DVFS operations. > >>>>> > >>>>> Signed-off-by: Thomas Abraham <thomas.ab@xxxxxxxxxxx> > >>>>> --- > >>>>> drivers/clk/samsung/clk-exynos4.c | 96 > >>>>> ++++++++++++++++++++++++++++++++++++- 1 files changed, 95 > >>>>> insertions(+), 1 deletions(-) > >>>>> > >>>>> diff --git a/drivers/clk/samsung/clk-exynos4.c > >>>>> b/drivers/clk/samsung/clk-exynos4.c index d967571..4bf2f93 > >>>>> 100644 --- a/drivers/clk/samsung/clk-exynos4.c > >>>>> +++ b/drivers/clk/samsung/clk-exynos4.c > >>>>> @@ -108,8 +108,11 @@ > >>>>> #define APLL_CON0 0x14100 > >>>>> #define E4210_MPLL_CON0 0x14108 > >>>>> #define SRC_CPU 0x14200 > >>>>> +#define STAT_CPU 0x14400 > >>>>> #define DIV_CPU0 0x14500 > >>>>> #define DIV_CPU1 0x14504 > >>>>> +#define DIV_STAT_CPU0 0x14600 > >>>>> +#define DIV_STAT_CPU1 0x14604 > >>>>> #define GATE_SCLK_CPU 0x14800 > >>>>> #define GATE_IP_CPU 0x14900 > >>>>> #define E4X12_DIV_ISP0 0x18300 > >>>>> @@ -289,7 +292,7 @@ static unsigned long exynos4_clk_regs[] > >>>>> __initdata = { }; > >>>>> > >>>>> /* list of all parent clock list */ > >>>>> -PNAME(mout_apll_p) = { "fin_pll", "fout_apll", }; > >>>>> +PNAME(mout_apll_p) = { "fin_pll", "fout_apll1", }; > >>>>> PNAME(mout_mpll_p) = { "fin_pll", "fout_mpll", }; > >>>>> PNAME(mout_epll_p) = { "fin_pll", "fout_epll", }; > >>>>> PNAME(mout_vpllsrc_p) = { "fin_pll", "sclk_hdmi24m", }; > >>>>> @@ -306,6 +309,7 @@ PNAME(mout_onenand_p) = {"aclk133", > >>>>> "aclk160", }; PNAME(mout_onenand1_p) = {"mout_onenand", > >>>>> "sclk_vpll", }; > >>>>> > >>>>> /* Exynos 4210-specific parent groups */ > >>>>> +PNAME(armclk_p) = { "fout_apll", }; > >>>> > >>>> Here you only give no parent clock, but at > >>>> samsung_coreclk_register() it is expected to provide list of > >>>> parents. > >>> > >>> Here only one parent is listed, but the core clock type does not > >>> limit the number of parents that can be specified. A specific > >>> implementation can define and use multiple parents. > >> > >> I only pointed out that the definition of the: > >> > >> samsung_coreclk_register("armclk", armclk_p, > >> ARRAY_SIZE(armclk_p), "fout_apll", > >> &exynos4210_armclk_clk_ops, arm_clk, > >> &exyno4210_armclk_table); > >> > >> Could only use parent, especially when you plan to change mux clock > >> (apll vs. mpll) by writing directly to registers (which I think is > >> bad). > > > > This definition is not limited to be used only on Exynos4210. This > > is a generic core clock registration helper function intended to be > > reusable across multiple Samsung SoCs. > > > > I think Lukasz meant that you should rather use parent list to pass > any input clocks of the core clock block, instead of hardcoded clock > look-up inside clock ops. Yes, that was my point. > > >> > >>> > >>>> > >>>>> PNAME(sclk_vpll_p4210) = { "mout_vpllsrc", > >>>>> "fout_vpll", }; PNAME(mout_core_p4210) = { "mout_apll", > >>>>> "sclk_mpll", }; PNAME(sclk_ampll_p4210) = { "sclk_mpll", > >>>>> "sclk_apll", }; @@ -1089,6 +1093,92 @@ static struct > >>>>> samsung_pll_clock exynos4x12_plls[nr_plls] __initdata = > >>>>> { VPLL_LOCK, VPLL_CON0, NULL), }; > >>>>> > >>>>> +#define EXYNOS4210_DIV_CPU0(apll, pclk_dbg, atb, periph, > >>>>> corem1, corem0) \ > >>>>> + ((apll << 24) | (pclk_dbg << 20) | (atb << 16) | \ > >>>>> + (periph << 12) | (corem1 << 8) | (corem0 << 4)) > >>>>> +#define EXYNOS4210_DIV_CPU1(hpm, copy) \ > >>>>> + ((hpm << 4) | (copy << 0)) > >>>>> +static const unsigned long exynos4210_armclk_data[][2] = { > >>>>> + { EXYNOS4210_DIV_CPU0(7, 1, 4, 3, 7, 3), > >>>>> EXYNOS4210_DIV_CPU1(0, 5), }, > >>>>> + { EXYNOS4210_DIV_CPU0(7, 1, 4, 3, 7, 3), > >>>>> EXYNOS4210_DIV_CPU1(0, 4), }, > >>>>> + { EXYNOS4210_DIV_CPU0(7, 1, 3, 3, 7, 3), > >>>>> EXYNOS4210_DIV_CPU1(0, 3), }, > >>>>> + { EXYNOS4210_DIV_CPU0(7, 1, 3, 3, 7, 3), > >>>>> EXYNOS4210_DIV_CPU1(0, 3), }, > >>>>> + { EXYNOS4210_DIV_CPU0(7, 1, 3, 3, 7, 3), > >>>>> EXYNOS4210_DIV_CPU1(0, 3), }, > >>>>> + { EXYNOS4210_DIV_CPU0(0, 1, 3, 1, 3, 1), > >>>>> EXYNOS4210_DIV_CPU1(0, 3), }, +}; > >>>>> + > >>>> > >>>> What do you think about adding those parameters (like CPU > >>>> dividers) as an attribute to /cpus/cpu@0 node? > >>> > >>> Not in CPU node but may be in clock controller node since these > >>> values are actually used by the clock controller. > >> > >> /cpus/cpu@0 seems like a good place for them (since those DIVs are > >> related to core) > > > > DIVs belong to the clock controller, not the CPU, and are addressed > > from the clock controller address space. > > I also think that the contents of cpu@0 node should be limited only > to CPU specific data. Personally I don't even like the idea of having > operating points there - I would rather see frequency limits there, > i.e. maximum frequency allowed at given voltage; specific frequency > values could be inferred from available APLL/core clock > configurations. > > > > >> . > >> However, we can choose any better DT node to add it. > >> > >>> But since these values are > >>> Exynos4210 specific and not generic enough to be reused across > >>> multiple Exynos SoCs, there is little benefit in defining > >>> bindings and parsing code for these values. It would be simpler > >>> enough to just embed them in the code. > >> > >> It would be less to code, but isn't it the same ugly code, which we > >> have now at exynos4xxx-cpufreq.c? > >> > >> With those values parsed from DT we can write generic code for the > >> "arm_clk" clock. One clock implementation for cpufreq-cpu0.c (and > >> maybe for arm_big_little.c) reused by Exynos4/5. > > > > As replied in the 2/3 patch, if these values would change across > > multiple platforms based on the same SoC, it makes sense to put them > > into DT. Any data that is purely SoC specific and not going to > > change across platforms can be embedded into the code itself. > > Well, they do change. If not on per board basis, then at least with > SoC revisions. I've added a real life example at my reply to patch 2/6 > > Anyway, the biggest problem is that the same data needs to be > duplicated (well, triplicated) for each driver that needs them. If > you can find a reasonable way to avoid redundancy, without having DT > involved, I will probably be fine with it. > > >>> > >>> These are frequencies supported by the core clock. But the cpufreq > >>> table can use all or subset of the supported frequencies. > >> > >> I see your point, but I find this distinction here a bit > >> superfluous. > > > > Replied to similar comment in 2/3 patch. > > > >> > >>> The core > >>> clock should be usable with the clock api independently and not > >>> tied to be used only by cpufreq driver. > >> > >> But then still for Exynos it will use PLL's M P S coefficients > >> which only corresponds to values defined at cpufreq's frequency > >> table. > > > > The PLL clocks are now separated out as PLL clock types in > > samsung/clk-pll.c file. The P,M,S values of the PLLs are now handled > > over there. So now the PLL is independent of the cpufreq driver and > > can support any number of clock speeds not limited to the ones > > needed by cpufreq. > > Don't forget about opposite side of this relation. The APLL needs to > support at least those that are required by cpufreq. > > > > >> > >> The set of frequencies for PLL, cpufreq and this clock is the > >> same, so I think that we shall not define them in three different > >> places. > >> > >> Could you give any example supporting your point of view? > > > > A PLL is a hardware component that can be reused in multiple SoCs. A > > PLL can generate and support 'x' number of clock speeds but a SoC > > using that PLL might use only 'y' (a subset of 'x') number of clock > > speeds of the PLL due to certain hardware limitations. Then there > > are platforms using a SoC which might use only 'z' (a subset of > > 'y') clock speeds due to the power/performance requirements of the > > platform. > > > > As observed with our platforms, 'x' is usually SoC or SoC-revision > specific and equal to 'y' and 'z' on respective platforms. > > >> > >>> > >>>> > >>>>> + > >>>>> +static const struct samsung_core_clock_freq_table > >>>>> exyno4210_armclk_table = { > >>>>> + .freq = exynos4210_armclk_freqs, > >>>>> + .freq_count = ARRAY_SIZE(exynos4210_armclk_freqs), > >>>>> + .data = (void *)exynos4210_armclk_data, > >>>>> +}; > >>>>> + > >>>>> +static int exynos4210_armclk_set_rate(struct clk_hw *hw, > >>>>> unsigned long drate, > >>>>> + unsigned long prate) > >>>>> +{ > >>>>> + struct samsung_core_clock *armclk; > >>>>> + const struct samsung_core_clock_freq_table *freq_tbl; > >>>>> + unsigned long *freq_data; > >>>>> + unsigned long mux_reg, idx; > >>>>> + void __iomem *base; > >>>>> + > >>>>> + if (drate == prate) > >>>>> + return 0; > >>>>> + > >>>>> + armclk = container_of(hw, struct samsung_core_clock, hw); > >>>>> + freq_tbl = armclk->freq_table; > >>>>> + freq_data = (unsigned long *)freq_tbl->data; > >>>>> + base = armclk->ctrl_base; > >>>>> + > >>>>> + for (idx = 0; idx < freq_tbl->freq_count; idx++, freq_data > >>>>> += 2) > >>>>> + if ((freq_tbl->freq[idx] * 1000) == drate) > >>>>> + break; > >>>>> + > >>>>> + if (!armclk->fout_pll) > >>>>> + armclk->fout_pll = __clk_lookup("fout_apll");\ > >>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^[*] > >>>> > >>>> I'm a bit confused here for two reasons. Please correct me if I'm > >>>> wrong. > >>>> > >>>> 1. You go into this ->set_rate() because of calling clk_set_rate > >>>> at "arm_clk" clock (numbered as 12 at clk-exynos4.c) at > >>>> cpufreq-cpu0.c > >>>> > >>>> In a Exynos4210 we have: > >>>> XXTI-> APLL -> fout_apll -> mout_apll -> mout_core -> div_core > >>>> -> div_core2 -> arm_clk > >>>> > >>>> In the code you call directly the fout_apll which changes > >>>> frequency. Then the change shall be propagated to all registered > >>>> clocks. > >>>> I think, that DIV and DIV1 shall be reduced before PLL change > >>>> [*], to reflect the changes at CCF. > >>> > >>> The core clock implementation encapsulates multiple clock blocks > >>> (such as dividers and muxes) which are in between the output of > >>> the APLL and the point that actually is the cpu domain clock > >>> output. > >> > >> No problem with that. I mostly agree... > >> > >>> When a clock > >>> frequency change has to be made, all these clock blocks > >>> encapsulated within the core clock are programmed by > >>> pre-determined values. > >> > >> And what about the situation with already defined clocks (like > >> "div_core" and "div_core2"). Those will not be updated when you > >> first call clk_set_rate() and change by hand DIV and DIV1. > >> > >> What if you would like to have the PCLK_DBG clock used in the > >> future? You would add it to CCF and the change will not propagate. > > > > I did intend to remove individual clock blocks which are now > > encapsulated within the core clock type from the clock driver file. > > I missed doing that in this patch series. > > Yes, they should be removed, since they are encapsulated inside the > core clock now. This was not my point here. You are creating an opaque clock for cpu (like "samsung_clock_cpu" [*]) , which encapsulates many other clocks (mostly DIVs and MUXes to be more precise). I don't think, that those clocks shall be removed from CCF. I do believe, that they shall be passed as the argument to [*] and recalculated. Why the advent of [*] clock forbids me from creating and using clocks based on "div_core", "div_core2" and PCLK_DBG? What if I would like to see output of those particular clocks at /sys/kernel/debug/ or use them at a custom driver? > > > > >> > >>> This > >>> approach allows very fast clock speed switching, instead of > >>> traversing the entire CCF clock tree searching for individual > >>> clock blocks to be programmed. > >> > >> Those are mostly DIV and MUXes. Recalculation shouldn't be time > >> consuming. > > > > I was mainly referring to the time taken to search the clock tree > > for these individual clock blocks. > > Hmm, why couldn't you simply look-up all the needed clock at > initialization and keep references to them? +1 > > Still, I think it is fine to directly program registers of > encapsulated clocks, And then recalculate values of relevant clocks build on top of those registers. > since they are not visible outside anymore and > there is no need for them to be visible. Why do you assume, that those clocks won't be needed anymore? For me clocks like [*] are valid clocks from CCF/user point of view. > > > > >> > >>> > >>>> > >>>> > >>>>> + > >>>>> + if (drate < prate) { > >>>>> + mux_reg = readl(base + SRC_CPU); > >>>>> + writel(mux_reg | (1 << 16), base + SRC_CPU); > >>>>> + while (((readl(base + STAT_CPU) >> 16) & 0x7) != > >>>>> 2) > >>>>> + ; > >>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ [**] > >>>> > >>>> 2. I think, the above shall be done in a following way: > >>>> > >>>> clk_set_parent(mout_core, mout_mpll); > >>>> clk_set_rate(armclk->fout_pll, drate); > >>>> clk_set_parent(mout_core, mout_apll); > >>>> > >>>> The direct write to registers [**] doesn't look compliant to CCF. > >>>> > >>> > >>> As mentioned above, the clock block encapsulates these clock > >>> blocks into a single clock and only this single encapsulated > >>> clock is registered with CCF. The internal implementation of how > >>> the different clock blocks are managed within this clock is > >>> independent of the CCF. > >> > >> I agree, that the CPU_DIV and CPU_DIV1 shall be changed atomically > >> (without CCF). > >> > >> But on the situation [**] the MUX can be changed by > >> clk_set_parent() as it is now done at exynosXXXX-cpufreq.c code. > > > > The mux is also encapsulated into a larger clock type and this new > > clock type know how the mux has to be configured. > > IMHO it's fine to encapsulate the mux as well. There are no users of > it other than the core clock. In spite of the above comment, it looks far more better to change clocks with CCF API (like clk_set_parent), not by writing directly to registers. > > > > >> > >> > >>> > >>>> > >>>> I'd rather thought about using "mout_core" instead of "arm_clk". > >>>> Then we would get access to the parent directly: > >>>> > >>>> struct clk *parent = clk_get_parent(hw->clk); > >>>> > >>>> so we set the parents explicitly (at clk registration) and call > >>>> ->recalc_rate for clocks which are lower in the tree (like > >>>> "div_core", "div_core2"). > >>> > >>> That was not the intention as mentioned above. > >> > >> This is just another possible solution to the problem. > > Those clocks should not be dealt with separately and this was the > intention of this composite clock. I agree with Thomas here. I cannot agree here, as stated at the above comment. > > Best regards, > Tomasz > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@xxxxxxxxxxxxxxxxxxx > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel Best regards, Lukasz Majewski
Attachment:
signature.asc
Description: PGP signature