Comment # 54
on bug 69723
from Martin Andersson
I have a 6950 and I'm seeing exactly the same thing as Alexandre: random hangs that completely lock up the machine. I can't ssh into it and nothing is printed to the logs; the only thing that works is a power cycle. If I disable dpm the machine is stable.

I have run with dpm since 3.11 and have had the occasional lockup, maybe one every two weeks. But I started playing more games recently and noticed that the lockups became much more frequent, so I decided to investigate.

The method I use to trigger a lockup is to run GpuTest in a loop, with a 10 second sleep after each run. I do this to trigger power level switches. The arguments to GpuTest are /test=plot3d /benchmark /benchmark_duration_ms=10000 /no_scorebox. At the same time I run piglit quick.tests in a loop. I later found out that the piglit tests are not essential to get lockups, but I kept running them for consistency.

20 of these test runs have resulted in a lockup. Of these, the longest lasted 80 minutes and the shortest 3 minutes, with an average of 23 minutes. The test runs that didn't cause lockups had dpm either completely disabled or only certain features disabled; which features is described below. If I run GpuTest constantly, without the sleep and with a longer benchmark duration, I don't get any lockups (I have done several long runs, the longest being over six hours).

I also tried to find a good commit. I started with 7ad8d0687bb5030c3328bc7229a3183ce179ab25 (drm/radeon/dpm: re-enable state transitions for Cayman) + the gcc fixes, but I get lockups on that commit as well.

I checked out 3.13-rc2 and started disabling features in ni_dpm_init. I disabled the following things without any improvement. I re-enabled each feature after I had tested it and cold booted the machine.
eg_pi->smu_uvd_hs
pi->mvdd_control
eg_pi->vddci_control
pi->gfx_clock_gating
pi->mg_clock_gating
pi->mgcgtssm
pi->dynamic_pcie_gen2
pi->thermal_protection
pi->display_gap
pi->dcodt
pi->ulps
eg_pi->abm
eg_pi->mcls
eg_pi->light_sleep
eg_pi->memory_transition
ni_pi->cac_weights->enable_power_containment_by_default
ni_pi->use_power_boost_limit
pi->sclk_ss

eg_pi->pcie_performance_request was already false, so I didn't test it. I noticed that pi->mvdd_control wasn't set; is that normal?

I don't get any lockups with pi->voltage_control disabled, but I also don't get any power level switches. If I set eg_pi->dynamic_ac_timing to false my machine locks up somewhere in the boot process; I haven't looked into that any deeper.

However, if I set pi->dynamic_ss to false the lockups disappear. It also works with pi->dynamic_ss set to true and pi->mclk_ss set to false. So it seems, at least for me, that it has something to do with mclk together with power level switches.

I'm not sure what to test next, but one thing might be to try to remove performance power level 2, so that it could only switch between levels 0 and 1. I haven't figured out how to accomplish that yet.
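For anyone who wants to try the same reproduction, the GpuTest loop described above could be sketched roughly as follows. This is only an illustration, not the script actually used: the binary path ./GpuTest, the function names, and the injectable run/sleep parameters are assumptions. Only the GpuTest arguments and the 10 second pause come from the description above, and the piglit loop is left out.

```python
import subprocess
import time

# Arguments quoted in the comment above.
GPUTEST_ARGS = [
    "/test=plot3d",
    "/benchmark",
    "/benchmark_duration_ms=10000",
    "/no_scorebox",
]


def build_command(binary="./GpuTest"):
    """Return the GpuTest command line. The binary path is an assumption."""
    return [binary] + GPUTEST_ARGS


def stress_loop(iterations=None, idle_seconds=10,
                run=subprocess.call, sleep=time.sleep, binary="./GpuTest"):
    """Alternate one benchmark run with an idle pause.

    The pause lets the GPU drop back to a lower power level, so every
    iteration should force at least one switch up and one switch down.
    With iterations=None the loop runs until interrupted (or the machine hangs).
    """
    count = 0
    while iterations is None or count < iterations:
        run(build_command(binary))   # about 10 s of load
        sleep(idle_seconds)          # idle so the power level can drop
        count += 1
    return count
```

Calling stress_loop() with no arguments runs until the machine hangs or the script is interrupted. Passing idle_seconds=0 together with a longer benchmark duration would approximate the constant-load case that did not lock up.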