Hi, > Gesendet: Mittwoch, 13. September 2023 um 13:43 Uhr > Von: "AngeloGioacchino Del Regno" <angelogioacchino.delregno@xxxxxxxxxxxxx> > An: frank-w@xxxxxxxxxxxxxxx, "Frank Wunderlich" <linux@xxxxxxxxx>, linux-mediatek@xxxxxxxxxxxxxxxxxxx > Il 13/09/23 12:52, Frank Wunderlich ha scritto: > > Am 13. September 2023 10:16:51 MESZ schrieb AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx>: > > Hi angelo, > > > > thanks for first look > > > >> Il 11/09/23 20:33, Frank Wunderlich ha scritto: > >>> From: Frank Wunderlich <frank-w@xxxxxxxxxxxxxxx> > >>> > >>> Add Support for mediatek fologic 880/MT7988. > >>> > >>> Signed-off-by: Frank Wunderlich <frank-w@xxxxxxxxxxxxxxx> > >>> --- > >>> drivers/thermal/mediatek/lvts_thermal.c | 73 +++++++++++++++++++++++++ > >>> 1 file changed, 73 insertions(+) > >>> > >>> diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c > >>> index c1004b4da3b6..48b257a3c80e 100644 > >>> --- a/drivers/thermal/mediatek/lvts_thermal.c > >>> +++ b/drivers/thermal/mediatek/lvts_thermal.c > >>> @@ -82,6 +82,8 @@ > >>> #define LVTS_GOLDEN_TEMP_DEFAULT 50 > >>> #define LVTS_COEFF_A_MT8195 -250460 > >>> #define LVTS_COEFF_B_MT8195 250460 > >>> +#define LVTS_COEFF_A_MT7988 -204650 > >>> +#define LVTS_COEFF_B_MT7988 204650 > >>> #define LVTS_MSR_IMMEDIATE_MODE 0 > >>> #define LVTS_MSR_FILTERED_MODE 1 > >>> @@ -1272,6 +1274,67 @@ static int lvts_remove(struct platform_device *pdev) > >>> return 0; > >>> } > >>> +/* > >>> + * LVTS MT7988 > >>> + */ > >>> +#define LVTS_HW_SHUTDOWN_MT7988 117000 > >> > >> Are you sure that this chip's Tj is >117°C ?! > >> > >> Looks a bit high... if it is exactly 117°C, I would suggest cutting earlier, > >> either at 110 (safe side) or 115: after all, this is a life-saver feature and > >> the chip is actually never meant to *constantly* work at 110°C (as it would > >> degrade fast and say goodbye earlier than "planned"). > > > > I took values from SDK > > > > https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/refs/heads/master/target/linux/mediatek/files-5.4/drivers/thermal/mediatek/soc_temp_lvts.c#1483 > > > > That kernel also defines 117°C for MT8195, which leaves me a bit dubious. > > For safety, I would recommend using the same 105°C hw shutdown temperature > for 7988 as well: after all if you think about it, reaching that kind of > temperature means that there's a real problem (fan stopped working and/or > heatsink detached) that needs attention. ack, will change to same value and put the define on top, abve the mt8195 > This is because you'll have trip points in devicetree that should throttle > the CPU in case it reaches at least a dangerously high temperature (read: > a temperature outside the recommended continuous operation range), bringing > the temperatures down because of the throttling action; I would imagine > throttling the CPU a bit down at 80°C, then a bit more at 90°C - but then, > if the temps won't drop like that, there's clearly a HW-related issue that > must be addressed (like the fan/heatsink scenario that I just described). > > Though, take this as an advice; I'll respect whichever decision you'll take. > > >>> +//enum mt7988_lvts_domain { MT7988_AP_DOMAIN, MT7988_NUM_DOMAIN }; > >>> + > >>> +enum mt7988_lvts_sensor_enum { > >>> + MT7988_TS3_0, > >>> + MT7988_TS3_1, > >>> + MT7988_TS3_2, > >>> + MT7988_TS3_3, > >>> + MT7988_TS4_0, > >>> + MT7988_TS4_1, > >>> + MT7988_TS4_2, > >>> + MT7988_TS4_3, > >>> + MT7988_NUM_TS > >>> +}; > > > >> This enumeration should be definitions in bindings (mediatek,lvts-thermal.h). > >> > >> Besides, the LVTS is about internal temperatures, so those TS3_x and 4_x can > >> be renamed like what was done for MT8192 and MT8195: this is because you will > >> never see TS3_2 being CPU2 on a board and CPU4 on another, being those - again - > >> internal to the SoC, hence unchangeable. > > > > Right these sensors are internally only and i took naming from sdk to avoid confusion. And i have not more information about these internal sensors (special meaning),but their values are packed together to get the resulting (average) temperature. > > > >> Another reason is that you'll anyway have to refer to those sensors in the > >> devicetree to configure thermal trips and such, so... :-) > > > > In device tree it will look like this: > > > > https://github.com/frank-w/BPI-Router-Linux/blob/6.5-lvts/arch/arm64/boot/dts/mediatek/mt7988a.dtsi#L771 > > > > Daniel has also defined thermal trips there,but these are untested atm. I only verified temperature itself i get from sysfs as far as i can (start at ~40°C and reaching ~70 while running). > > > > Check how it's done in mt8192.dtsi and mt8195.dtsi: we're using the definitions > from the bindings for thermal zones. > At least for consistency (apart from other even more valid considerations), that's > how it should be done: please do it like that. > > Besides, I think that you can definitely guess what `TS` is CPU related: checking > the driver and devicetree from the downstream kernel that you provided, what I > understand is: > > 1. Your naming TS3,4 corresponds to TS2,3 downstream (so it's wrong?) > 2. Downstream TS2 is related to the CPU cores, so it should be > - TS2_0 - CPU0 > - TS2_1 - CPU1 > - TS2_2 - CPU2 > - TS2_3 - CPU3 i took the names from efuse names (which are listed as comment above), had 2/3 before. need to ask mtk here. same for the second controller and if it is really a "lvts-ap". > The other set of thermal sensors seem to be completely unused, so we cannot guess > just by looking at the code. Since you have the hardware, you may be able to take > a position about what they are by checking what sensor heats up for each "action" > (could be ethernet controller or infra or whatever else); if in doubt, just leave > them out, but add a comment saying that there are more sensors and say where, so > that if anyone finds what they're for, it'll be easy to add them. > > In any case, adding thermal sensors that you don't know about is meaningless, as > the sense of all this is: > - Monitoring temperatures of the hardware > - Taking action to prevent temperature overrun > but if you don't know what you're reading, you can't interpret temperatures for > something unknown, hence you can't as well take action to prevent overruns, as > you won't know what's the maximum temperature for "unknown thing" :-) yes i have only the downstream kernel and some additional information like interupt number and the efuse offsets for calibration. And for the 8 sensors i get this information: "8 sensors are distributed across the whole soc in order to get the correct average temperature. chip designer didn’t disclose more detailed info to us." so i thought all 8 sensors are taken into account with both controllers to calculate 1 resulting temperature. In downstream i saw also only 1 controller in dts and i have only 1 interrupt. https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/refs/heads/master/target/linux/mediatek/files-5.4/arch/arm64/boot/dts/mediatek/mt7988.dtsi#579 and the compatible for both (mt7988 and mt8195) are without "ap" there, so i took the same. Where have you seen the mt7988-AP ? i have only seen 1 sensor which should be the SOC itself and this chip has metal surface which makes it hard to get real temperature (infrared thermometer gives wrong temp). > Regards, > Angelo