On Tue, 2023-11-14 at 08:22 -0800, Teres Alexis, Alan Previn wrote:
> When suspending, add a timeout when calling
> intel_gt_pm_wait_for_idle else if we have a leaked
> wakeref (which would be indicative of a bug elsewhere
> in the driver), driver will at least exit the suspend-resume
> cycle, after the kernel detects the held reference and
> prints a message to abort suspending instead of hanging
> in the kernel forever which then requires serial connection
> or ramoops dump to debug further.

NOTE: this patch originates from Patch#3 of this other series
https://patchwork.freedesktop.org/series/121916/ (rev 5 and prior)
and was moved out into its own patch, since it aims to improve
general debuggability rather than to resolve the bug addressed by
that series.

alan:snip

> +int intel_wakeref_wait_for_idle(struct intel_wakeref *wf, int timeout_ms)
>  {
> -	int err;
> +	int err = 0;
>
>  	might_sleep();
>
> -	err = wait_var_event_killable(&wf->wakeref,
> -				      !intel_wakeref_is_active(wf));
> +	if (!timeout_ms)
> +		err = wait_var_event_killable(&wf->wakeref,
> +					      !intel_wakeref_is_active(wf));
> +	else if (wait_var_event_timeout(&wf->wakeref,
> +					!intel_wakeref_is_active(wf),
> +					msecs_to_jiffies(timeout_ms)) < 1)
> +		err = -ETIMEDOUT;
> +

alan: paraphrasing feedback from Tvrtko on the originating series of
this patch: it would be a good idea to add error-injection into this
timeout to ensure we don't have any other subsystem that could
inadvertently leak an rpm wakeref (and catch such bugs in future
pre-merge).
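
alan: to make the error-injection idea concrete, here is a rough
user-space sketch of the shape it could take. This is not the driver
code: struct fake_wakeref, its inject_timeout knob and
fake_wakeref_wait_for_idle() are hypothetical names made up for
illustration, and the wait is a plain polling loop instead of
wait_var_event_timeout(). Unlike the hunk above, a timeout_ms of 0
here gives up immediately rather than waiting indefinitely, just to
keep the sketch bounded.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical user-space stand-in for struct intel_wakeref. */
struct fake_wakeref {
	int count;           /* outstanding references; 0 means idle */
	bool inject_timeout; /* hypothetical error-injection knob */
};

/* Milliseconds elapsed since *start on the monotonic clock. */
static long elapsed_ms(const struct timespec *start)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return (now.tv_sec - start->tv_sec) * 1000L +
	       (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/*
 * Poll until the wakeref goes idle or timeout_ms expires.
 * Returns 0 when idle, -ETIMEDOUT on a real or injected timeout.
 */
static int fake_wakeref_wait_for_idle(struct fake_wakeref *wf, int timeout_ms)
{
	const struct timespec tick = { .tv_sec = 0, .tv_nsec = 1000000L };
	struct timespec start;

	assert(wf != NULL);

	/* Injected failure: behave as if a leaked wakeref never went idle. */
	if (wf->inject_timeout)
		return -ETIMEDOUT;

	clock_gettime(CLOCK_MONOTONIC, &start);
	while (wf->count > 0) {
		if (elapsed_ms(&start) >= timeout_ms)
			return -ETIMEDOUT;
		nanosleep(&tick, NULL); /* sleep ~1 ms between polls */
	}

	return 0;
}
```

The point of the knob is that callers on the suspend path can be
exercised against -ETIMEDOUT in pre-merge testing without needing a
real leaked wakeref; how that would be wired into the driver's
existing injection facilities is left open here.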