Re: [REGRESSION] Laptop with Ryzen 4600H fails to resume video since 5.17.4 (works 5.17.3)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello

I've checked with 5.18 the problem is still there.
Interestingly, I tried to revert the commit but it was rejected because of the 
change in the test from:
        if (!adev->in_s0ix) 
to:
      if (amdgpu_acpi_should_gpu_reset(adev))

in amdgpu_pmops_suspend.

I fixed the rejection, keeping shoud_gpu_reset, but it still fails.
Then I changed to restore test of in_s0ix as it was in 5.17, and it works.
I tried with a call to amd_gpu_asic_reset without testing at all in_s0ix, it 
works.

Therefore, my APU wants a reset in amdgpu_pmops_suspend.

By curiosity, I tried to do the reset in amdgpu_pmops_suspend_noirq as was 
intended in 5.18 original code, commenting out the test of 
amdgpu_acpi_should_gpu_reset(adev) (since this APU wants a reset).
This does not work, I got the Fence timeout errors or freezes.

If I leave  noirq function unchanged (original 5.18 code), and just add a 
reset in suspend() as was done in 5.17, it works.

Therefore, my GPU does NOT want to be reset in noirq, the reset must be in 
suspend.

In other words, I modified amdgpu_pmops_suspend (partial revert) like this and 
this works on my laptop:

static int amdgpu_pmops_suspend(struct device *dev)
{
	struct drm_device *drm_dev = dev_get_drvdata(dev);
	struct amdgpu_device *adev = drm_to_adev(drm_dev);
+	int r;

	if (amdgpu_acpi_is_s0ix_active(adev))
		adev->in_s0ix = true;
	else
		adev->in_s3 = true;
-	return amdgpu_device_suspend(drm_dev, true);
+	r = amdgpu_device_suspend(drm_dev, true);
+	if (r)
+		return r;
+	if (!adev->in_s0ix)
+		return amdgpu_asic_reset(adev);
	return 0;
}

static int amdgpu_pmops_suspend_noirq(struct device *dev)
{
	struct drm_device *drm_dev = dev_get_drvdata(dev);
	struct amdgpu_device *adev = drm_to_adev(drm_dev);

	if (amdgpu_acpi_should_gpu_reset(adev))
		return amdgpu_asic_reset(adev);

	return 0;
}

I don't know if other APU want a reset, in the same context, and how to 
differentiate all the cases, so I cannot go further, but I can test patches if 
needed.

CC

Le mercredi 18 mai 2022, 08:37:27 CEST Thorsten Leemhuis a écrit :
> On 18.05.22 07:54, Kai-Heng Feng wrote:
> > On Wed, May 18, 2022 at 1:52 PM Thorsten Leemhuis
> > 
> > <regressions@xxxxxxxxxxxxx> wrote:
> >> On 17.05.22 19:37, casteyde.christian@xxxxxxx wrote:
> >>> I've tryied to revert the offending commit on 5.18-rc7 (887f75cfd0da
> >>> ("drm/amdgpu: Ensure HDA function is suspended before ASIC reset"), and
> >>> the problem disappears so it's really this commit that breaks.
> >> 
> >> In that case I'll update the regzbot status to make sure it's visible as
> >> regression introduced in the 5.18 cycle:
> >> 
> >> #regzbot introduced: 887f75cfd0da
> >> 
> >> BTW: obviously would be nice to get this fixed before 5.18 is released
> >> (which might already happen on Sunday), especially as the culprit
> >> apparently was already backported to stable, but I guess that won't be
> >> easy...
> >> 
> >> Which made me wondering: is reverting the culprit temporarily in
> >> mainline (and reapplying it later with a fix) a option here?
> > 
> > It's too soon to call it's the culprit.
> 
> Well, sure, the root-cause might be somewhere else. But from the point
> of kernel regressions (and tracking them) it's the culprit, as that's
> the change that triggers the misbehavior. And that's how Linus
> approaches these things as well when it comes to reverting to fix
> regressions -- and he even might...
> 
> > The suspend on the system
> > doesn't work properly at the first place.
> 
> ...ignore things like this, as long as a revert is unlikely to cause
> more damage than good.
> 
> Ciao. Thorsten
> 
> >> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> >> 
> >> P.S.: As the Linux kernel's regression tracker I deal with a lot of
> >> reports and sometimes miss something important when writing mails like
> >> this. If that's the case here, don't hesitate to tell me in a public
> >> reply, it's in everyone's interest to set the public record straight.








[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux