Re: Expecting to revert commit 55285e21f045 "fbdev/efifb: Release PCI device ..."

Good morning guys,

First of all, get well soon, Linus.

I'm unfortunately not the best expert on runtime power management (that would be Alex) or on display (Harry), but from the lack of response I guess that they are already on vacation. So take everything I explain here with a grain of salt.

As for the background: we have two separate power management features here, neither of which seems to work as it should.

The first buggy one is runtime power management, which is what commit 55285e21f045 surfaces. My educated guess is that the now-corrected reference counting turns off the GPU before userspace has a chance to signal the monitor to turn off its backlight. Double-checking the code, I can see the correct calls to pm_runtime_get_*() and pm_runtime_put_*() in amdgpu_dm_atomic_commit_tail(), but to be honest that function seems to be quite a mess.
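
To illustrate the reference counting argument, here is a toy userspace model, not kernel code. The names toy_dev, toy_pm_get() and toy_pm_put() are made up for this sketch, and unlike the real runtime PM core it suspends immediately instead of after an autosuspend delay.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a device's runtime PM state; not the kernel's struct. */
struct toy_dev {
    int usage_count;   /* outstanding "get" references */
    bool suspended;    /* true once the device has been powered down */
};

/* Take a reference and wake the device (loosely models pm_runtime_get_*()). */
void toy_pm_get(struct toy_dev *d)
{
    d->usage_count++;
    d->suspended = false;
}

/*
 * Drop a reference (loosely models pm_runtime_put_*()).
 * The device may only power down once nobody holds a reference.
 * Assumption: we suspend right away, with no autosuspend delay.
 */
void toy_pm_put(struct toy_dev *d)
{
    assert(d->usage_count > 0);   /* an unbalanced put is a bug */
    d->usage_count--;
    if (d->usage_count == 0)
        d->suspended = true;
}
```

In this model, a reference that is taken once and never dropped keeps the device awake no matter how many balanced get/put pairs follow. Once that last reference is released, the next put that brings the count to zero powers the device down, which is the behaviour that only became reachable after the commit.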

A trace of what exactly happens during PM autosuspend might help here. Daniel, do you know of any tracepoint for that?

Then we have DPMS, which is basically the way of telling the monitor to shut off its backlight. When this doesn't work as expected (e.g. you need *two* cycles), it may just as well be that userspace is not sending the right command.

If you use X, you could double-check with "xset dpms force off" and "xset dpms force suspend". At least with my monitor the backlight turns off in both cases, but maybe your hardware behaves differently.

Regards,
Christian.

On 20.12.21 at 23:21, Linus Torvalds wrote:
[ Adding back in more amd people and the amd list, the people Daniel
added seem to have gotten lost again, but I think people at least saw
my original report thanks to Daniel ]

With "amdgpu.runpm=0", things are better, but not perfect. With that I
can lock the screen, and it has to go through *two* cycles of "No
signal, turning off", but on the second cycle it does finally work.

This was exposed by commit 55285e21f045 ("fbdev/efifb: Release PCI
device's runtime PM ref during FB destroy"), probably because that
made runtime PM actually potentially work, but it is then broken on
amdgpu.

Absolutely nothing odd in my setup. Two monitors, one GPU. PCI ID
1002:67df rev e7, subsystem ID 1da2:e353.

I'd expect pretty much any amdgpu person to see this.

On Mon, Dec 20, 2021 at 2:04 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
Note: on my machine, I get that

    amdgpu 0000:49:00.0: amdgpu: Using BACO for runtime pm

so maybe the other possible runtime pm models (ATPX and BOCO) are ok,
and it's only that BACO case that is broken.

Hmm. The *documentation* says:

     PX runtime pm
         2 = force enable with BAMACO,
         1 = force enable with BACO,
         0 = disable,
         -1 = PX only default

but the code actually makes anything != 0 enable it, except on VEGA20
and ARCTURUS, where it needs to be positive.

My card is apparently "POLARIS10", whatever that means, which means
that any non-zero value of amdgpu_runtime_pm will enable runtime PM as
long as "amdgpu_device_supports_baco()" is true. Which it is.
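
As a rough sketch of the behaviour described above (not the actual amdgpu code), the decision for a BACO-capable card could be modelled like this. The enum and the function toy_runpm_enabled() are hypothetical names for illustration, and only the BACO path is modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the ASIC types named in this mail. */
enum toy_asic { TOY_POLARIS10, TOY_VEGA20, TOY_ARCTURUS };

/*
 * Models only the BACO path as described above: any non-zero runpm
 * value enables runtime PM, except on VEGA20 and ARCTURUS, where the
 * value has to be positive.
 */
bool toy_runpm_enabled(enum toy_asic asic, int runpm, bool supports_baco)
{
    if (runpm == 0 || !supports_baco)
        return false;
    if (asic == TOY_VEGA20 || asic == TOY_ARCTURUS)
        return runpm > 0;
    return true;
}
```

Under this model, POLARIS10 with the default of -1 comes out enabled, while VEGA20 with -1 does not, which is where the observed behaviour and the "-1 = PX only default" wording part ways.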

Whatever. Now I'm just kvetching about the documentation not matching
what I see the code doing, which is never a great sign when things
don't work.

               Linus



