On Tue, Jul 23, 2024 at 10:23 AM Sam Protsenko <semen.protsenko@xxxxxxxxxx> wrote: > > On Tue, Jul 23, 2024 at 9:17 AM Mateusz Majewski > <m.majewski2@xxxxxxxxxxx> wrote: > > > > > Do you know what are the possible implications of not using ACPM? As I > > > understand, ACPM is a Samsung's downstream framework which uses APM > > > (Active Power Management) IP block internally to act as an IPC > > > mechanism, which makes it possible to offload any PM related > > > operations (which might get quite heavy, if we are to belive the TRM > > > description of APM) from CPU to APM. I'm not against the direct > > > registers access based implementation (in fact, I'm not sure how that > > > APM/ACPM thing can be implemented in upstreamable way and if it's > > > worth it at all). Just curious if we understand what we are > > > potentially missing out, and if at some point we'll be forced to > > > implement that ACPM thing anyway (for something else)? > > > > Not sure honestly. The downstream v4.10 driver does many operations on > > registers anyway...? > > > > > Not sure if that's true, as already discussed in my comments for the > > > previous patches. Looks like one clock is still needed, which is the > > > PCLK bus clock (to interface registers) which might simultaneously act > > > as an operating (functional) clock. > > > > The code seems to be working correctly without this clock, both register > > reads and writes. Maybe the support for extra sensors, which I couldn't > > get to work, would require this clock? > > > > Chances are that clock was enabled by the bootloader for us (or it's > just enabled by default) and it just keeps running. If that's so, I'd > say it must be described in dts and controlled by the driver. Because > otherwise it might get disabled at any point in future, e.g. kernel > may disable it during startup as an unused clock (when it's added to > the clock driver), etc. Let me enable that clock for you, and then you > can use /sys/kernel/debug/clk/ files to disable it manually and see if > it actually affects TMU driver. > Yeah, that clock is definitely needed. Just submitted the series [1] adding it, which makes the kernel stuck on startup when your series is applied. To fix that I added next lines to the TMU node in dts: 8<--------------------------------------------------------------------------->8 tmuctrl_0: tmu@10070000 { compatible = "samsung,exynos850-tmu"; reg = <0x10070000 0x800>; interrupts = <GIC_SPI 226 IRQ_TYPE_LEVEL_HIGH>; + clock-names = "tmu_apbif"; + clocks = <&cmu_peri CLK_GOUT_BUSIF_TMU_PCLK>; #thermal-sensor-cells = <0>; }; 8<--------------------------------------------------------------------------->8 Please rework your patches to account for that required clock. Alas the TMU dts changes can't be submitted until my series [1] is applied. But you can still apply my series locally, and I think the driver and bindings changes don't depend on that clock, so it should be ok to send those Thanks! [1] https://lore.kernel.org/linux-samsung-soc/20240723163311.28654-2-semen.protsenko@xxxxxxxxxx/T/#mf28e4aab0111b95479ef632bc1979dff93d28cc7 > > > Exynos850 TRM says AVG_CONTROL offset is 0x38, and 0x58 is actually > > > for THRESHOLD0_TEMP_RISE3_2 register. > > > > Thank you so much! Will fix in v2. Though writing to the right place > > doesn't seem to change much in practice, probably just means that the > > correct mode is being used. > > > > > Something seems off to me here. How come the shift value for EXYNOS7 > > > case is 8, but the mask is actually 9 bits long? Does it mean the > > > first error field is 8 bits long, and the second error field is 9 bits > > > long for EXYNOS7? I don't have the Exynos7 manual, so it's just a > > > hunch. But if it's true, maybe this shift value has to be added in > > > your [PATCH 2/6] to fix Exynos7 case? > > > > I did not really want to mess with Exynos7 code, as we don't have an > > Exynos7 board sadly. Honestly I feel like I should drop the 2/6 patch > > completely and only modify the code to run on 850 correctly. > > > > It feels like there is an error for Exynos7 case there. Take a look at > this commit: > > aef27b658b43 ("thermal: exynos: use sanitize_temp_error() in > exynos7_tmu_initialize()") > > I think that commit just forgets to update the shift value for Exynos7 > properly. This code: > > data->temp_error1 = trim_info & tmu_temp_mask; > data->temp_error2 = ((trim_info >> EXYNOS_TRIMINFO_85_SHIFT) & > EXYNOS_TMU_TEMP_MASK); > > in case of Exynos7 becomes: > > data->temp_error1 = trim_info & 0x1ff; // mask = 9 bits > data->temp_error2 = (trim_info >> 8) & 0xff; > > it contradicts itself, because it takes 9 rightmost bits for error1, > and then uses 1 of those bits for error2 too. It's obvious that if 9 > bits are already used for error1, then for error2 it has to be shifted > by 9 bits, not 8. > > That's why I think your patch 2/6 is legit and useful on its own, and > it's actually a good catch on your part! But the shift value has to be > fixed as well (for Exynos7). It's not ideal you don't have the > hardware to test it, but it just screams *bug* to me :) Also, maybe we > can ask someone who has Exynos7 hardware to test it for us? > > > > Also, just an idea: those values (and other similar values) could be > > > pre-calculated somewhere during the probe, stored in some struct (e.g. > > > _variant or _chip) and then just used here. > > > > sanitize_temp_error is only called one per probe and once per resume, so > > probably little to gain? > > > > Sure, it was just a minor suggestion to make the code look more linear > so to speak. It can be totally skipped. > > > Will also do all other.