amdgpu multi monitor - clock, heat and power problem


 



I have two Gigabyte RX 580s in my desktop workstation.

I'm running Arch Linux with KDE Plasma on the 5.0.6 kernel.

 

The cards themselves work fine, with one exception.

I have two 1080p HDMI monitors plugged into one of these cards:

one on a native HDMI port, the other through a passive DVI->HDMI adapter.

 

This causes the following problems at idle:

 

1. Memory clock is effectively locked at 2000 MHz

2. Core clock sits constantly in a high-frequency P-state

3. Temperatures are increased

4. Power consumption is increased (significantly)

5. PCIe bus always runs at full speed

6. Forcing the core clock to 300 MHz still uses a higher than usual voltage
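
For anyone wanting to check points 1 and 2 on their own card, here is a minimal Python sketch that reads the DPM level tables from sysfs and reports which level is currently active. It assumes the card is exposed as /sys/class/drm/card0/device (the index may differ) and that the pp_dpm_sclk and pp_dpm_mclk files list one level per line with the active one marked by "*". The helper names are my own, not part of any tool.

```python
import re
from pathlib import Path

# Assumption: first card; adjust the index for multi-GPU systems.
CARD = Path("/sys/class/drm/card0/device")


def active_level(table: str):
    """Return (index, MHz) of the level marked with '*', or None if none is marked."""
    for line in table.splitlines():
        # Expected line shape: "2: 2000Mhz *"
        m = re.match(r"\s*(\d+):\s*(\d+)\s*MHz\s*\*", line, re.IGNORECASE)
        if m:
            return int(m.group(1)), int(m.group(2))
    return None


def report(card: Path = CARD) -> None:
    """Print the active SCLK and MCLK level, if the sysfs files are readable."""
    for name in ("pp_dpm_sclk", "pp_dpm_mclk"):
        try:
            level = active_level((card / name).read_text())
        except OSError:
            print(f"{name}: not available")
            continue
        print(f"{name}: {level}")


if __name__ == "__main__":
    report()
```

If the memory clock is pinned as described, pp_dpm_mclk should show the highest level as active even with the desktop idle.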

 

Below is an excerpt from the rocm-smi utility for the automatic defaults

(I have omitted overclock and powercap values for formatting purposes)

 

 

2 Monitors connected to GPU 0, No monitors connected to GPU 1

ROCm System Management Interface

===============================================================================

GPU  Temp   AvgPwr   SCLK     MCLK     PCLK           Fan    Perf  GPU%
0    44.0c  36.193W  1145Mhz  2000Mhz  8.0GT/s, x16   40.0%  auto  0%
1    37.0c  28.104W  300Mhz   300Mhz   2.5GT/s, x8    0.0%   auto  0%

===============================================================================

End of ROCm SMI Log

 

GPU 0 is idle, yet it runs SCLK and MCLK at unnecessarily high power levels.

GPU 1 is truly idle.

Regarding the GPU 0 temperature: I have set up a daemon that runs the fan at a constant rate to keep it from constantly peaking.

 

-------------------------------------------------------------------------------

 

1 Monitor connected to GPU 0, No monitors connected to GPU 1

ROCm System Management Interface

===============================================================================

GPU  Temp   AvgPwr   SCLK     MCLK     PCLK           Fan    Perf  GPU%
0    36.0c  28.103W  300Mhz   300Mhz   2.5GT/s, x8    0.0%   auto  0%
1    37.0c  28.104W  300Mhz   300Mhz   2.5GT/s, x8    0.0%   auto  0%

===============================================================================

 

2 Monitors connected to GPU 0, No monitors connected to GPU 1 (performance level forced to low)

ROCm System Management Interface

===============================================================================

GPU  Temp   AvgPwr   SCLK     MCLK     PCLK           Fan    Perf  GPU%
0    44.0c  31.086W  300Mhz   2000Mhz  2.5GT/s, x8    40.0%  low   0%
1    37.0c  28.104W  300Mhz   300Mhz   2.5GT/s, x8    0.0%   low   0%

===============================================================================

 

Peculiarly, even with the low power state forced, the GPU runs at a voltage (950 mV) well above what 300 MHz requires (750 mV).

 

 

===============================================================================

cat /sys/kernel/debug/dri/0/amdgpu_pm_info jupiter: Mon Apr 8 21:57:29 2019

 

Clock Gating Flags Mask: 0x3fbcf

Graphics Medium Grain Clock Gating: On

Graphics Medium Grain memory Light Sleep: On

Graphics Coarse Grain Clock Gating: On

Graphics Coarse Grain memory Light Sleep: On

Graphics Coarse Grain Tree Shader Clock Gating: Off

Graphics Coarse Grain Tree Shader Light Sleep: Off

Graphics Command Processor Light Sleep: On

Graphics Run List Controller Light Sleep: On

Graphics 3D Coarse Grain Clock Gating: Off

Graphics 3D Coarse Grain memory Light Sleep: Off

Memory Controller Light Sleep: On

Memory Controller Medium Grain Clock Gating: On

System Direct Memory Access Light Sleep: Off

System Direct Memory Access Medium Grain Clock Gating: On

Bus Interface Medium Grain Clock Gating: Off

Bus Interface Light Sleep: On

Unified Video Decoder Medium Grain Clock Gating: On

Video Compression Engine Medium Grain Clock Gating: On

Host Data Path Light Sleep: On

Host Data Path Medium Grain Clock Gating: On

Digital Right Management Medium Grain Clock Gating: Off

Digital Right Management Light Sleep: Off

Rom Medium Grain Clock Gating: On

Data Fabric Medium Grain Clock Gating: Off

 

GFX Clocks and Power:

2000 MHz (MCLK)

300 MHz (SCLK)

600 MHz (PSTATE_SCLK)

1000 MHz (PSTATE_MCLK)

950 mV (VDDGFX)

31.14 W (average GPU)

 

GPU Temperature: 43 C

GPU Load: 0 %

 

UVD: Disabled

 

VCE: Disabled

===============================================================================
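
As a rough sketch, the clock, voltage and power lines from a dump like the one above can be pulled out with a few lines of Python, which makes it easier to log them over time. The sample text is copied from the dump. The 750 mV reference is my own figure for what 300 MHz should need, not something the driver reports, and the function names are made up for illustration.

```python
import re

# Lines copied from the "GFX Clocks and Power" section of the dump.
SAMPLE = """\
GFX Clocks and Power:
2000 MHz (MCLK)
300 MHz (SCLK)
600 MHz (PSTATE_SCLK)
1000 MHz (PSTATE_MCLK)
950 mV (VDDGFX)
31.14 W (average GPU)
"""

# Matches lines of the form "<number> <unit> (<label>)".
LINE = re.compile(r"^\s*([\d.]+)\s*(MHz|mV|W)\s*\(([^)]+)\)\s*$", re.MULTILINE)


def parse_pm_info(text: str) -> dict:
    """Map each label, e.g. 'VDDGFX', to a (value, unit) pair."""
    return {label: (float(value), unit) for value, unit, label in LINE.findall(text)}


def excess_voltage(values: dict, expected_mv: float = 750.0) -> float:
    """Millivolts above the expected idle voltage (expected_mv is an assumption)."""
    return values["VDDGFX"][0] - expected_mv


if __name__ == "__main__":
    info = parse_pm_info(SAMPLE)
    print(info["SCLK"], info["MCLK"], info["VDDGFX"])
    print("excess mV:", excess_voltage(info))
```

On a live system the text would come from /sys/kernel/debug/dri/0/amdgpu_pm_info, which normally needs root to read.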

 

 

Using amdgpu.ppfeaturemask=0xffffffff I am able to work around all of the above issues, but it requires me to set idle and performance clock speeds manually as needed. A 300 MHz memory and core clock drives two 1080p HDMI displays just fine.

But this leads to screen tearing and visible green artefacts when *changing* core clock speeds, as if there were a synchronization issue. When running at a fixed speed, all is well.
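
For reference, one possible way to pin the clocks by hand is through the amdgpu sysfs files. The Python sketch below only prints the writes by default (dry run). It assumes the card is card0 and that DPM level 0 corresponds to 300 MHz on both the core and memory tables, which should be checked against pp_dpm_sclk and pp_dpm_mclk first. The function names are hypothetical.

```python
from pathlib import Path

# Assumption: first card; adjust the index for multi-GPU systems.
CARD = Path("/sys/class/drm/card0/device")


def plan_fixed_clocks(card: Path = CARD, sclk_level: int = 0, mclk_level: int = 0):
    """Return the (path, value) writes that pin SCLK and MCLK to the given DPM levels."""
    return [
        # Manual mode must be selected before level masks are accepted.
        (card / "power_dpm_force_performance_level", "manual"),
        (card / "pp_dpm_sclk", str(sclk_level)),
        (card / "pp_dpm_mclk", str(mclk_level)),
    ]


def apply(writes, dry_run: bool = True) -> None:
    """Print the writes, or perform them when dry_run is False (needs root)."""
    for path, value in writes:
        if dry_run:
            print(f"echo {value} > {path}")
        else:
            path.write_text(value)


if __name__ == "__main__":
    apply(plan_fixed_clocks())
```

Writing "auto" back to power_dpm_force_performance_level returns control to the driver.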

 

The temperatures alone show that power is being wasted.

 

I have a UPS that can measure system power consumption reasonably accurately (in 16 W steps). At idle with default settings, letting the kernel and the GPUs manage things themselves, I sometimes read ~196 W!

 

2 Monitors (auto) -> 196W Idle

2 Monitors (low) -> 160W Idle

2 Monitors (Force 300) -> 112-128W Idle

1 monitor -> 96-128W Idle
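
To put numbers on the differences, here is a small Python calculation over the readings above. Each reading is treated as a (min, max) range in watts, and the saving is the range of possible differences against the "auto" case. Since the UPS reports in 16 W steps, each figure carries roughly that much uncertainty.

```python
# UPS readings from the list above, as (min, max) in watts.
readings = {
    "2 monitors (auto)": (196, 196),
    "2 monitors (low)": (160, 160),
    "2 monitors (force 300)": (112, 128),
    "1 monitor": (96, 128),
}


def saving_vs(baseline, other):
    """Smallest and largest possible saving (W) when going from baseline to other."""
    return baseline[0] - other[1], baseline[1] - other[0]


if __name__ == "__main__":
    auto = readings["2 monitors (auto)"]
    for name, rng in readings.items():
        lo, hi = saving_vs(auto, rng)
        print(f"{name}: saves {lo}-{hi} W vs auto")
```

By this arithmetic, forcing 300 MHz saves somewhere between 68 W and 84 W at the wall compared with the automatic defaults.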

 

Even if my UPS isn't giving the exact true values, that delta is concerning.

 

This is a longstanding issue which has been bugging me for a while now. I'm not sure whether it has come up before, or why it has persisted for so long.

But it should really be fixed, as it carries a fairly large thermal and power burden.

 

I have tried poking through the source code to figure this out, but no luck. Have I missed something? Is there a problem synchronizing display VSYNC on clock changes? Why is this happening? It's clearly not the right behaviour.

 

What can be done to fix this? Can I help?

 

Best Regards

Rigo Reddig


_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
