Re: [PATCH v2 1/1] arm64: dts: mediatek: mt8186: Increase CCI frequency

AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx> · Wed, 10 Jan 2024 12:10:04 +0100

Il 10/01/24 06:44, Chun-Jen Tseng (曾俊仁) ha scritto:
On Wed, 2023-11-29 at 14:22 +0100, AngeloGioacchino Del Regno wrote:
Il 14/09/23 14:10, Mark Tseng ha scritto:
The original CCI OPP table's lowest frequency 500 MHz is too low
and causes
system stalls. Increase the frequency range to 1.05 GHz ~ 1.4 GHz
and adjust
the OPPs accordingly.

Fixes: 32dfbc03fc26 ("arm64: dts: mediatek: mt8186: Add CCI node
and CCI OPP table")

Signed-off-by: Mark Tseng <chun-jen.tseng@xxxxxxxxxxxx>

You ignored my comment [1] on the v1 of this patch.

Besides, I think that you should at least keep the 500MHz frequency
for a
sleep-only/idle OPP to save power.

It would also be helpful to understand why you chose this new
frequency range,
so if you can, please put some numbers in the commit description,
showing the
stall in terms of requested BW vs actual BW (as I'd imagine that a 2x
increase
in CCI frequency means that we need *twice* the bandwidth compared to
what we
have for the workloads that are stalling the system).

Hi AngeloGioacchino Del Regno,

Thanks your reminder this issue. After ajdustment CCI OPP, we also do
power test benchmark and the result is PASS.

Sorry but `PASS` is not a number; I actually wanted a before and after power
consumption measurement in microwatts.

The original CCI table has stall issue.  When the Big CPU frequency set
on 2.05G and CCI frequency keep on 500MHz then run CTS MediaTest will
system stall then trigger watchdog reset SoC.

The CPU and CCI frequency setting are not in the same driver. So it
will have timing issue cause CPU stall side effect.

Are you trying to fix a frequency setting delay/desync with raising the
frequency of the CCI?
That's not the right way of doing it.

Asserting that we have a timing issue because the two frequency settings
are not done by the same driver is borderline wrong - but anyway - if there
is a frequency setting timing issue because of the interaction between the
two drivers (cpufreq/ccifreq), the right way of eliminating the stall is to
actually solve the root cause of that.

I'm insisting on this because if there's a "timing issue" this means that
even though the "base" CCI frequency is higher, during a scaling up operation
depending on how much the CCI gets flooded, you might *either*:
 - Have this same stall issue again, and/or
 - Have performance issues/drops while waiting for the CCI to scale up.

Even though you may not (or may...) get a stall issue again with this change,
you will surely get (very short) temporary performance drops during scaling up.

....and this is why your CCI frequency increase solution does *not* resolve
this issue, but only partially mitigates it.

That should get solved, not partially mitigated.

Besides that, can you please tell me how to replicate the stall issue, making
me able to better understand what's going on here?

Regards,
Angelo

BRs,

Mark Tseng

[1]:
https://lore.kernel.org/all/799325f5-29b5-f0c0-16ea-d47c06830ed3@xxxxxxxxxxxxx/

Regards,
Angelo