On 08/09/2021 03:21, Bjorn Andersson wrote:
On Mon 09 Aug 10:26 PDT 2021, Akhil P Oommen wrote:
On 8/9/2021 9:48 PM, Caleb Connolly wrote:
On 09/08/2021 17:12, Rob Clark wrote:
On Mon, Aug 9, 2021 at 7:52 AM Akhil P Oommen
<akhilpo@xxxxxxxxxxxxxx> wrote:
[..]
I am a bit confused. We don't define a power domain for the gpu in dt,
correct? Then what exactly does set_opp do here? Do you think this usleep
is somehow helping to mask the issue here?
The power domains (for cx and gx) are defined in the GMU DT, the OPPs in
the GPU DT. For the sake of simplicity I'll refer to the lowest
frequency (257000000) and OPP level (RPMH_REGULATOR_LEVEL_LOW_SVS) as
the "min" state, and the highest frequency (710000000) and OPP level
(RPMH_REGULATOR_LEVEL_TURBO_L1) as the "max" state. These are defined in
sdm845.dtsi under the gpu node.
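(As a sanity check, the frequency-to-level mapping can be read back with the
generic OPP helpers; a minimal sketch, assuming gpu_dev is the GPU's struct
device and that its opp-table from sdm845.dtsi has already been parsed:)

#include <linux/device.h>
#include <linux/err.h>
#include <linux/kernel.h>
#include <linux/pm_opp.h>

/* Illustrative only: look up the "min" and "max" frequencies in the GPU's
 * opp-table and print the opp-level each of them maps to.
 */
static void dump_gpu_opp_levels(struct device *gpu_dev)
{
	static const unsigned long freqs[] = { 257000000, 710000000 };
	int i;

	for (i = 0; i < ARRAY_SIZE(freqs); i++) {
		unsigned long freq = freqs[i];
		struct dev_pm_opp *opp;

		opp = dev_pm_opp_find_freq_exact(gpu_dev, freq, true);
		if (IS_ERR(opp))
			continue;

		dev_info(gpu_dev, "%lu Hz -> opp-level %u\n",
			 freq, dev_pm_opp_get_level(opp));

		dev_pm_opp_put(opp);
	}
}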
The new devfreq behaviour unmasks what I think is a driver bug: it
inadvertently puts much more strain on the GPU regulators than they
usually see. With the new behaviour the GPU jumps from its min state to
the max state and back again extremely rapidly under workloads as small
as refreshing the UI. Where previously the GPU would rarely, if ever, go
above 342MHz when interacting with the device, it now jumps between min
and max many times per second.
If my understanding is correct, the current implementation of the GMU
set freq is the following:
- Get OPP for frequency to set
- Push the frequency to the GMU - immediately updating the core clock
- Call dev_pm_opp_set_opp(), which triggers a notify chain; this winds
up somewhere in the power management code and causes the gx regulator
level to be updated
Nope. dev_pm_opp_set_opp() sets the bandwidth for the GPU and nothing else.
We were earlier using a different API, dev_pm_opp_set_bw(), which got
deprecated.
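Putting those two together, the flow looks roughly like this (a loose
sketch, not the literal drm/msm code; gmu_request_freq() is a placeholder
name for the real "push the frequency to the GMU" step):

#include <linux/device.h>
#include <linux/err.h>
#include <linux/pm_opp.h>

/* Placeholder for the actual GMU frequency request. */
static void gmu_request_freq(struct device *dev, unsigned long hz);

static int gpu_set_freq_sketch(struct device *dev, unsigned long target_hz)
{
	struct dev_pm_opp *opp;
	int ret;

	/* 1) Find the OPP covering the requested frequency. */
	opp = dev_pm_opp_find_freq_ceil(dev, &target_hz);
	if (IS_ERR(opp))
		return PTR_ERR(opp);

	/* 2) Hand the frequency to the GMU; this is where the core clock
	 *    gets updated immediately.
	 */
	gmu_request_freq(dev, target_hz);

	/* 3) Per the reply above, this casts the interconnect/bandwidth
	 *    vote for the OPP (replacing the deprecated dev_pm_opp_set_bw()),
	 *    and nothing else.
	 */
	ret = dev_pm_opp_set_opp(dev, opp);

	dev_pm_opp_put(opp);
	return ret;
}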
On the Lenovo Yoga C630 this is reproduced by starting alacritty, and if
I'm lucky I manage to hit a few keys before it crashes, so I spent a
few hours looking into this as well...
As you say, dev_pm_opp_set_opp() will only cast an interconnect vote.
The opp-level is just there for show and isn't used by anything, at
least not on 845.
Furthermore, I'm missing something in my tree, so the interconnect
provider doesn't hit sync_state, and as such we're not actually scaling
the buses. So the problem is not that Linux doesn't turn on the buses in
time.
So I suspect that the "AHB bus error" isn't saying that we turned off
the bus, but rather that the GPU becomes unstable or something of that
sort.
Lastly, I reverted 9bc95570175a ("drm/msm: Devfreq tuning") and ran
Aquarium for 20 minutes without a problem. I then switched the gpu
devfreq governor to "userspace" and ran the following:
while true; do
    echo 257000000 > /sys/class/devfreq/5000000.gpu/userspace/set_freq
    echo 710000000 > /sys/class/devfreq/5000000.gpu/userspace/set_freq
done
It took 19 iterations of this loop to crash the GPU.
So the problem doesn't seem to be Rob's change; it's just that prior to
it the chance of hitting it was way lower. The question is still what it
is that we're triggering.
Do the opp-levels in the DTS represent how the hardware behaves? If so, then
it does just appear that whatever is responsible for scaling the GX rail
voltage has no rate limiting and will attempt to switch the regulator between
the min and max voltages as often as we tell it to, which is probably not
something the hardware expects.
Regards,
Bjorn
--
Kind Regards,
Caleb (they/them)