Re: [PATCH v2] drm/xe/ufence: Flush xe ordered_wq in case of ufence timeout

On 10/25/2024 09:03, Nirmoy Das wrote:
On 10/24/2024 6:32 PM, Jani Nikula wrote:
On Thu, 24 Oct 2024, Nirmoy Das <nirmoy.das@xxxxxxxxx> wrote:
Flush the xe ordered_wq when a ufence timeout occurs. The timeout is
observed on LNL and points to the recent scheduling issue with E-cores.

This is similar to the recent fix:
commit e51527233804 ("drm/xe/guc/ct: Flush g2h worker in case of g2h
response timeout") and should be removed once there is an E-core
scheduling fix.

v2: Add platform check (Himal)
    s/__flush_workqueue/flush_workqueue (Jani)

Cc: Badal Nilawar <badal.nilawar@xxxxxxxxx>
Cc: Jani Nikula <jani.nikula@xxxxxxxxx>
Cc: Matthew Auld <matthew.auld@xxxxxxxxx>
Cc: John Harrison <John.C.Harrison@xxxxxxxxx>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@xxxxxxxxx>
Cc: Lucas De Marchi <lucas.demarchi@xxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx> # v6.11+
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2754
Suggested-by: Matthew Brost <matthew.brost@xxxxxxxxx>
Signed-off-by: Nirmoy Das <nirmoy.das@xxxxxxxxx>
Reviewed-by: Matthew Brost <matthew.brost@xxxxxxxxx>
---
  drivers/gpu/drm/xe/xe_wait_user_fence.c | 14 ++++++++++++++
  1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_wait_user_fence.c b/drivers/gpu/drm/xe/xe_wait_user_fence.c
index f5deb81eba01..78a0ad3c78fe 100644
--- a/drivers/gpu/drm/xe/xe_wait_user_fence.c
+++ b/drivers/gpu/drm/xe/xe_wait_user_fence.c
@@ -13,6 +13,7 @@
  #include "xe_device.h"
  #include "xe_gt.h"
  #include "xe_macros.h"
+#include "compat-i915-headers/i915_drv.h"
Sorry, you just can't use this in xe core. At all. Not even a little
bit. It's purely for i915 display compat code.

If you need it for the LNL platform check, you need to use:

	xe->info.platform == XE_LUNARLAKE

Will do that. That macro looked odd but I didn't know a better way.

Although platform checks in xe code are generally discouraged.

Unfortunately, this issue depends on the platform rather than the graphics IP.
But isn't this issue dependent on the CPU platform, not the graphics platform? For example, a DG2 card plugged into an LNL host will also have this issue, so testing any graphics-related value is technically incorrect.

John.



Thanks,

Nirmoy

BR,
Jani.



  #include "xe_exec_queue.h"

  static int do_compare(u64 addr, u64 value, u64 mask, u16 op)
@@ -155,6 +156,19 @@ int xe_wait_user_fence_ioctl(struct drm_device *dev, void *data,
  		}

  		if (!timeout) {
+			if (IS_LUNARLAKE(xe)) {
+				/*
+				 * This is analogous to e51527233804 ("drm/xe/guc/ct: Flush g2h
+				 * worker in case of g2h response timeout")
+				 *
+				 * TODO: Drop this change once workqueue scheduling delay issue is
+				 * fixed on LNL Hybrid CPU.
+				 */
+				flush_workqueue(xe->ordered_wq);
+				err = do_compare(addr, args->value, args->mask, args->op);
+				if (err <= 0)
+					break;
+			}
  			err = -ETIME;
  			break;
  		}
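
For illustration only, the timeout branch with Jani's suggested check (xe->info.platform == XE_LUNARLAKE in place of IS_LUNARLAKE(xe)) might look roughly like the userspace sketch below. This is not the actual v3 patch. The struct layout, the fence_mem variable, and the helpers mock_flush_ordered_wq(), mock_do_compare() and handle_timeout() are hypothetical stand-ins so the logic can be compiled outside the kernel, and the compare is reduced to a masked equality test that returns 0 when satisfied and 1 otherwise, which is an assumption inferred from the "err <= 0" check in the hunk.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; not the real xe definitions. */
enum xe_platform { XE_DG2, XE_LUNARLAKE };

struct xe_device {
	struct { enum xe_platform platform; } info;
	int flushes;		/* mock only: number of flushes performed */
	uint64_t pending;	/* mock only: value a delayed worker would write */
};

/* Mock only: memory location the user fence is compared against. */
static uint64_t fence_mem;

/* Stand-in for flush_workqueue(xe->ordered_wq): lets the delayed work land. */
static void mock_flush_ordered_wq(struct xe_device *xe)
{
	fence_mem = xe->pending;
	xe->flushes++;
}

/* Simplified compare (assumption): 0 if the masked values match, 1 if not. */
static int mock_do_compare(uint64_t value, uint64_t mask)
{
	return (fence_mem & mask) == (value & mask) ? 0 : 1;
}

/* Timeout branch: on LNL, flush and re-check once before reporting -ETIME. */
static int handle_timeout(struct xe_device *xe, uint64_t value, uint64_t mask)
{
	if (xe->info.platform == XE_LUNARLAKE) {
		int err;

		mock_flush_ordered_wq(xe);
		err = mock_do_compare(value, mask);
		if (err <= 0)
			return err;
	}

	return -ETIME;
}
```

Note that this still keys off the graphics platform, so it does not address John's point that a discrete card in an LNL host would miss the workaround.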




