Re: [PATCH v2 06/13] thermal: tegra: Do not register cooling device

On Fri, Nov 10, 2023 at 02:55:15PM +0100, Thierry Reding wrote:
> On Fri, Oct 13, 2023 at 05:57:13PM +0200, Daniel Lezcano wrote:
> > On 12/10/2023 19:58, Thierry Reding wrote:
> > > From: Thierry Reding <treding@xxxxxxxxxx>
> > > 
> > > The SOCTHERM's built-in throttling mechanism doesn't map well to the
> > > concept of a cooling device because it will automatically start to
> > > throttle when the programmed temperature threshold is crossed.
> > > 
> > > Remove the cooling device implementation and instead unconditionally
> > > program the throttling for the CPU and GPU thermal zones.
> > > 
> > > Signed-off-by: Thierry Reding <treding@xxxxxxxxxx>
> > > ---
> > 
> > [ ... ]
> > 
> > > +	ret = of_property_read_u32(np, "temperature-millicelsius",
> > > +				   &stc->temperature);
> > > +	if (ret < 0)
> > > +		goto err;
> > > +
> > > +	ret = of_property_read_u32(np, "hysteresis-millicelsius",
> > > +				   &stc->hysteresis);
> > > +	if (ret < 0)
> > > +		goto err;
> > > +
> > > +	stc->num_zones = of_count_phandle_with_args(np, "nvidia,thermal-zones",
> > > +						    NULL);
> > > +	if (stc->num_zones > 0) {
> > > +		struct device_node *zone;
> > > +		unsigned int i;
> > > +
> > > +		stc->zones = devm_kcalloc(ts->dev, stc->num_zones, sizeof(zone),
> > > +					  GFP_KERNEL);
> > > +		if (!stc->zones)
> > > +			return -ENOMEM;
> > > +
> > > +		for (i = 0; i < stc->num_zones; i++) {
> > > +			zone = of_parse_phandle(np, "nvidia,thermal-zones", i);
> > > +			stc->zones[i] = zone;
> > > +		}
> > > +	}
> > 
> > What is the connection between the temperature sensor and the hardware
> > limiter?
> > 
> > I mean, one hand there is the hardware limiter which is not connected to the
> > sensor neither a thermal zone and it could be self contained in a separate
> > driver. And then there is the temperature sensor.
> > 
> > The thermal zone phandle things connected with the throttling bindings
> > sounds like strange to me.
> > 
> > What prevents to split the throttling and the sensor into separate code?
> 
> Both the temperature sensor and the hardware throttle mechanism are part
> of the same IP block, so it would be quite difficult (and unnecessary)
> to split them into separate drivers.
> 
> The hardware throttler uses the temperature sensor's data to initiate
> throttling automatically when certain (programmable) temperature
> thresholds are reached.
> 
> The reason why we need to reference the thermal zone is because the
> registers needed to program the throttler are contained within the
> sensor group (which are effectively mapped to thermal zones).
> 
> I suppose there are a number of other ways how this could be described.
> The thermal zones could be extended with extra information about the
> throttling, or we could use just the sensor group ID instead of a full
> phandle to reference this.
> 
> I was sort of trying to keep things somewhat aligned with the concept of
> thermal zones and not rewrite the entire thing, but perhaps I should go
> back to the drawing board and think about whether there's an even better
> way to describe this in DT.

I've looked at the documentation in a bit more detail and here's a
high-level overview of what SOCTHERM is.

We have four groups (CPU, GPU, MEM and PLLX), each of which can be
programmed at four different levels (each level is an identical set of
registers to program temperature thresholds, throttling and enable or
disable). For temperature thresholds an interrupt can be configured.

There's an additional "thermtrip" level, which only has a threshold
that, when reached, will cause an emergency, hardware-induced shutdown
of the system.
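
As a rough illustration of the layout described above, a plain C model
could look like the following. All type and function names are
hypothetical and are not taken from the soctherm driver or the hardware
documentation; only the four groups, the four identical levels per group
and the separate thermtrip threshold come from the description.

```c
#include <stdbool.h>

/* Hypothetical model of the SOCTHERM layout; not the driver's real types. */
enum { SOC_GROUP_CPU, SOC_GROUP_GPU, SOC_GROUP_MEM, SOC_GROUP_PLLX };

#define SOC_NUM_GROUPS 4u
#define SOC_NUM_LEVELS 4u

/* One generic level: thresholds, throttling, interrupt and enable. */
struct soc_level {
	int low_mc;	/* lower threshold in millicelsius */
	int high_mc;	/* upper threshold in millicelsius */
	bool throttle;	/* assert throttling when the threshold is crossed */
	bool irq;	/* raise an interrupt when the threshold is crossed */
	bool enabled;
};

/* One group: four identical levels plus the thermtrip threshold. */
struct soc_group {
	struct soc_level levels[SOC_NUM_LEVELS];
	int thermtrip_mc;	/* emergency hardware shutdown threshold */
};

static struct soc_group soc_groups[SOC_NUM_GROUPS];

/* Program one generic level; returns 0 on success, -1 on bad arguments. */
static int soc_program_level(unsigned int group, unsigned int level,
			     int low_mc, int high_mc, bool throttle, bool irq)
{
	struct soc_level *l;

	if (group >= SOC_NUM_GROUPS || level >= SOC_NUM_LEVELS)
		return -1;
	if (low_mc > high_mc)
		return -1;

	l = &soc_groups[group].levels[level];
	l->low_mc = low_mc;
	l->high_mc = high_mc;
	l->throttle = throttle;
	l->irq = irq;
	l->enabled = true;

	return 0;
}

/* Thermtrip only has a threshold, nothing else to configure. */
static int soc_program_thermtrip(unsigned int group, int threshold_mc)
{
	if (group >= SOC_NUM_GROUPS)
		return -1;

	soc_groups[group].thermtrip_mc = threshold_mc;
	return 0;
}
```

In this model nothing ties a level to a particular purpose, which is why
the usage conventions described next matter.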

Any of the generic levels can be used in whatever way we want. The
convention currently is to program the thermal zone trip points using
level 0. So for each group we create a thermal zone and level 0 for each
of the zones is programmed with the low and high thresholds for a given
trip point.

Currently we also use levels 1 and 2 to program the "light" and "heavy"
throttling "indicators". These will in turn be used to generate outputs
to the actual throttling mechanisms (CPU-light, CPU-heavy, GPU-light and
GPU-heavy).

There are a few other things that can be done, but I don't fully
understand how they would be useful and I don't think they've ever been
used, so I'll skip those for now.

Given the above, the thermal zone trip points are fairly clear and are
fine as implemented. For the throttling mechanism we could do something
that maps more explicitly to the groups and levels described above, but
I think that could easily conflict with the trip point programming.
Keeping the current conventions and designing the device tree bindings
around them should help avoid such conflicts.

So I think keeping the throttle-cfgs node is a good fit. We don't really
need to establish a connection between the thermal zone and the throttle
mechanism, though. We can derive the level from the indicator (light or
heavy) and for the group we only need an ID. I proposed a link to the
thermal zone because the thermal zone already contains that ID, but we
could equally well add an nvidia,group property (or something along
those lines) that tells us which group to use, rather than trying to
get it from a thermal zone.
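
To make the idea concrete, here is a minimal sketch of how a (group,
level) pair could be resolved without a thermal zone phandle. It assumes
the current convention of level 1 for "light" and level 2 for "heavy",
and a group ID read from something like the nvidia,group property
mentioned above. That property is only a suggestion at this point, and
the struct and function names are hypothetical.

```c
#include <stddef.h>

#define THR_NUM_GROUPS 4u	/* CPU, GPU, MEM and PLLX */

enum thr_indicator { THR_LIGHT, THR_HEAVY };

/* Hypothetical parsed form of one child of the throttle-cfgs node. */
struct thr_cfg {
	unsigned int group;		/* e.g. from a proposed "nvidia,group" */
	enum thr_indicator indicator;	/* light or heavy */
};

/* Current convention: level 0 is for trip points, 1 light, 2 heavy. */
static int thr_level_for(enum thr_indicator indicator)
{
	switch (indicator) {
	case THR_LIGHT:
		return 1;
	case THR_HEAVY:
		return 2;
	}

	return -1;
}

/* Resolve the group and level to program; no thermal zone is involved. */
static int thr_resolve(const struct thr_cfg *cfg, unsigned int *group,
		       unsigned int *level)
{
	int lvl;

	if (!cfg || !group || !level)
		return -1;
	if (cfg->group >= THR_NUM_GROUPS)
		return -1;

	lvl = thr_level_for(cfg->indicator);
	if (lvl < 0)
		return -1;

	*group = cfg->group;
	*level = (unsigned int)lvl;

	return 0;
}
```

Because level 0 never comes out of this mapping, the throttle
configuration cannot collide with the trip point programming.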

I'll revise the bindings to see if I can come up with something.

Thierry
