On 11/26/2024 7:50 PM, Mario Limonciello wrote: >>>> Based on https://gitlab.freedesktop.org/drm/amd/-/issues/3781 >>>> >>>> What if this callback takes care only of suspend case and leaves the >>>> hibernate case to dpm_prepare callback? >>> >>> Then hibernate would fail under memory pressure. >>> >>> My take is this failure with hibernate is a userspace problem (whether >>> userspace decides to freeze the tasks or not). I think it's better that >>> we /try/ to do the eviction and notify them if userspace should be >>> changed. >>> >> >> Hmm, the logic is kind of inconsistent now. >> >> For dGPUs, evict is required for s0ix, s3, s4 and attempted twice. >> For APUs, evict is required for s4, but attempted only once. > > Well the logic was inconsistent before already. > > But FWIW it will still be attempted a second time for APUs. > The call sequence is: > > 1) PM notifier > amdgpu_device_pm_notifier() > ->amdgpu_device_evict_resources() > > Eviction for dGPU and APU in S4. > > 2) Prepare callback > amdgpu_pmops_prepare() > ->amdgpu_device_prepare() > ->->amdgpu_device_evict_resources() > > Eviction for dGPU only. > > 3) Suspend callback > amdgpu_pmops_freeze() > ->amdgpu_device_suspend() > ->->amdgpu_device_evict_resources() > > Eviction for dGPU and APU in S4. > > So yes it's incongruent that prepare() doesn't include APU S4, but the > problem is you won't know at prepare() time whether it's S4 AFAICT. > > Or is there way to tell at prepare() for S4? > Now this notifier comes in before that and tells that right? BTW, I didn't realize that there is a max of 3 attempts now (didn't notice it being called in suspend also). Thanks, Lijo >> >> Thanks, >> Lijo >> >>>> >>>> Thanks, >>>> Lijo >>>>> + case PM_SUSPEND_PREPARE: >>>>> + r = amdgpu_device_evict_resources(adev); >>>>> + adev->in_s4 = false; >>>>> + /* >>>>> + * This is considered non-fatal at thie time because >>>>> + * amdgpu_device_prepare() will also evict resources. >>>>> + * See https://gitlab.freedesktop.org/drm/amd/-/issues/3781 >>>>> + */ >>>>> + if (r) >>>>> + drm_warn(adev_to_drm(adev), "Failed to evict resources, >>>>> freeze active processes if problems occur\n"); >>>>> + break; >>>>> + } >>>>> + >>>>> + return NOTIFY_DONE; >>>>> +} >>>>> + >>>>> /** >>>>> * amdgpu_device_prepare - prepare for device suspend >>>>> * >>>> >>> >> >