On 2018-07-19 02:30 PM, Alex Deucher wrote: > On Thu, Jul 19, 2018 at 1:07 PM, Andrey Grodzovsky > <Andrey.Grodzovsky at amd.com> wrote: >> >> >> On 07/19/2018 12:59 PM, Michel Dänzer wrote: >>> >>> On 2018-07-19 06:53 PM, Andrey Grodzovsky wrote: >>>> >>>> >>>> On 07/19/2018 12:47 PM, Michel Dänzer wrote: >>>>> >>>>> On 2018-07-19 06:33 PM, Andrey Grodzovsky wrote: >>>>>> >>>>>> On 07/19/2018 11:39 AM, Ville Syrjälä wrote: >>>>>>> >>>>>>> On Thu, Jul 19, 2018 at 11:19:56AM -0400, Andrey Grodzovsky wrote: >>>>>>>> >>>>>>>> Problem: >>>>>>>> FB is still not unpinned during the first run of amdgpu_bo_evict_vram >>>>>>>> and so it's left for the second run, but during second run the SDMA >>>>>>>> for >>>>>>>> moving buffer around already disabled and you have to do >>>>>>>> it with CPU, but FB is not in visible VRAM and hence the eviction >>>>>>>> failure >>>>>>>> leading later to resume failure. >>>>>>>> >>>>>>>> Fix: >>>>>>>> When DAL in use get a pointer to FB from crtc->primary->state rather >>>>>>>> then from crtc->primary which is not set for DAL since it supports >>>>>>>> atomic KMS. >>>>>>>> >>>>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065 >>>>>>>> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic >>>>>>>> drivers >>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com> >>>>>>>> --- >>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++- >>>>>>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>>>>>> >>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>> index 709e4a3..dd9ebf7 100644 >>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device >>>>>>>> *dev, bool suspend, bool fbcon) >>>>>>>> /* unpin the front buffers and cursors */ >>>>>>>> list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) >>>>>>>> { >>>>>>>> struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc); >>>>>>>> - struct drm_framebuffer *fb = crtc->primary->fb; >>>>>>>> + struct drm_framebuffer *fb = >>>>>>>> amdgpu_device_has_dc_support(adev) ? >>>>>>>> + crtc->primary->state->fb : crtc->primary->fb; >>>>>>> >>>>>>> So apparently you haven't yet turned off the planes here. If I'm >>>>>>> reading things right amdgpu_device_ip_suspend() should end up doing >>>>>>> that through drm_atomic_helper_suspend(). So it looks like like now >>>>>>> you'll end up unpinning the same bos twice. Doesn't that mess up >>>>>>> some kind of refcount or something? >>>>>> >>>>>> amdgpu_bo_unpin has a guard against that, amdgpu_bo_unreserve is less >>>>>> clear. >>>>> >>>>> BO reservation shouldn't an issue here, BOs are only reserved for a >>>>> short time around (un)pinning them. >>>>> >>>>> >>>>>>> To me it would seem better to susped the display before trying >>>>>>> to evict the bos. >>>>>> >>>>>> Yea, i was aware of that and indeed DAL shouldn't rely on the code in >>>>>> amdgpu_device_suspend to unpin >>>>>> front buffer and cursor since the atomic helper should do it. Problem >>>>>> is >>>>>> that during amdgpu_device_ip_suspend >>>>>> the SDMA engine gets suspended too, so you have to embed another >>>>>> eviction in between, after display is suspended but before >>>>>> SDMA and this forces ordering between them which kind of already in >>>>>> place (amd_ip_block_type) but still it's an extra constrain. >>>>> >>>>> Ville's point (which I basically agree with) is that the display >>>>> hardware should be turned off before evicting VRAM the first time, in >>>>> which case no second eviction should be necessary (for this purpose). >>>> >>>> Display HW is turned off as part of all IPs in a loop inside >>>> amdgpu_device_ip_suspend. >>>> Are you suggesting to extract the display HW turn off from inside >>>> amdgpu_device_ip_suspend and place it >>>> before the first call to amdgpu_bo_evict_vram ? >>> >>> In a nutshell, yes. >>> >>> Or maybe it would be easier to move the amdgpu_bo_evict_vram call down >>> to somewhere called from amdgpu_device_ip_suspend? >> >> >> I can move the BEFORE and AFTER calls to amdgpu_bo_evict_vram inside >> amdgpu_device_ip_suspend >> such that the first one is called AFTER display is shut off, while the >> second in the very end of the function. >> I am just not sure what's gonna be the side effect of evicting after bunch >> of blocks (such as GMC) are already disabled. > > How about something like the attached patches? Basically split the ip > suspend sequence in two like we do for resume. > Patches are Acked-by: Harry Wentland <harry.wentland at amd.com> Harry > Alex >