On Tuesday, 22 November 2022 11:41:29 CET Tvrtko Ursulin wrote: > > On 21/11/2022 23:19, Janusz Krzysztofik wrote: > > Hi Andrzej, > > > > Thanks for providing your R-b. > > > > On Monday, 21 November 2022 18:40:51 CET Andrzej Hajda wrote: > >> On 21.11.2022 15:56, Janusz Krzysztofik wrote: > >>> Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work > >>> with GuC") extended the API of intel_gt_retire_requests_timeout() with an > >>> extra argument 'remaining_timeout', intended for passing back unconsumed > >>> portion of requested timeout when 0 (success) is returned. However, when > >>> request retirement happens to succeed despite an error returned by a call > >>> to dma_fence_wait_timeout(), that error code (a negative value) is passed > >>> back instead of remaining time. If we then pass that negative value > >>> forward as requested timeout to intel_uc_wait_for_idle(), an explicit BUG > >>> will be triggered. > > Right, AFAICT a GEM_BUG_ON in debug builds, but in production builds > negative timeout will get passed along all the way to schedule_timeout > where error and call trace will be dumped. So fix appears warranted indeed. > > >>> If request retirement succeeds but an error code is passed back via > >>> remaininig_timeout, we may have no clue on how much of the initial timeout > >>> might have been left for spending it on waiting for GuC to become idle. > >>> OTOH, since all pending requests have been successfully retired, that > >>> error code has been already ignored by intel_gt_retire_requests_timeout(), > >>> then we shouldn't fail. > >>> > >>> Assume no more time has been left on error and pass 0 timeout value to > >>> intel_uc_wait_for_idle() to give it a chance to return success if GuC is > >>> already idle. > >>> > >>> v3: Don't fail on any error passed back via remaining_timeout. > >>> > >>> v2: Fix the issue on the caller side, not the provider. > >>> > >>> Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work > > with GuC") > >>> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@xxxxxxxxxxxxxxx> > >>> Cc: stable@xxxxxxxxxxxxxxx # v5.15+ > >> > >> Reviewed-by: Andrzej Hajda <andrzej.hajda@xxxxxxxxx> > > > > While still open for comments from others, I'm now looking for potential > > committer. > > Both patches are considered good to go? Yes, I hope so. The actual diff of 2/2 v3 has received R-b from Andrzej still under discussion of v2, then I decided to add that tag to v3. Andrzej, can you please respond to 2/2 v3 and confirm your R-b applies? Thanks, Janusz > > Regards, > > Tvrtko > > > > > Thanks, > > Janusz > > > > > >> > >> Regards > >> Andrzej > >> > >>> --- > >>> drivers/gpu/drm/i915/gt/intel_gt.c | 9 +++++++-- > >>> 1 file changed, 7 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/ > > intel_gt.c > >>> index b5ad9caa55372..7ef0edb2e37cd 100644 > >>> --- a/drivers/gpu/drm/i915/gt/intel_gt.c > >>> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c > >>> @@ -677,8 +677,13 @@ int intel_gt_wait_for_idle(struct intel_gt *gt, long > > timeout) > >>> return -EINTR; > >>> } > >>> > >>> - return timeout ? timeout : intel_uc_wait_for_idle(>->uc, > >>> - > > remaining_timeout); > >>> + if (timeout) > >>> + return timeout; > >>> + > >>> + if (remaining_timeout < 0) > >>> + remaining_timeout = 0; > >>> + > >>> + return intel_uc_wait_for_idle(>->uc, remaining_timeout); > >>> } > >>> > >>> int intel_gt_init(struct intel_gt *gt) > >> > >> > > > > > > > > >