On Thu, 20 Apr 2023 08:43:52 -0700, Rodrigo Vivi wrote: > Hi Rodrigo, > On Thu, Apr 20, 2023 at 08:57:24AM +0100, Tvrtko Ursulin wrote: > > > > On 19/04/2023 23:10, Dixit, Ashutosh wrote: > > > On Wed, 19 Apr 2023 06:21:27 -0700, Tvrtko Ursulin wrote: > > > > > > > > > > Hi Tvrtko, > > > > > > > On 10/04/2023 23:35, Ashutosh Dixit wrote: > > > > > Instead of erroring out when GuC reset is in progress, block waiting for > > > > > GuC reset to complete which is a more reasonable uapi behavior. > > > > > > > > > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo) > > > > > > > > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@xxxxxxxxx> > > > > > --- > > > > > drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++---- > > > > > 1 file changed, 33 insertions(+), 5 deletions(-) > > > > > > > > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c > > > > > index 9ab8971679fe3..8471a667dfc71 100644 > > > > > --- a/drivers/gpu/drm/i915/i915_hwmon.c > > > > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c > > > > > @@ -51,6 +51,7 @@ struct hwm_drvdata { > > > > > char name[12]; > > > > > int gt_n; > > > > > bool reset_in_progress; > > > > > + wait_queue_head_t waitq; > > > > > }; > > > > > struct i915_hwmon { > > > > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val) > > > > > static int > > > > > hwm_power_max_write(struct hwm_drvdata *ddat, long val) > > > > > { > > > > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000) > > > > > + > > > > > + int ret = 0, timeout = GUC_RESET_TIMEOUT; > > > > > > > > Patch looks good to me > > > > > > Great, thanks :) > > > > > > > apart that I am not sure what is the purpose of the timeout? This is just > > > > the sysfs write path or has more callers? > > > > > > It is just the sysfs path, but the sysfs is accessed also by the oneAPI > > > stack (Level 0). In the initial version I also didn't have the timeout > > > thinking that the app can send a signal to the blocked thread to unblock > > > it. I introduced the timeout after Rodrigo brought it up and I am now > > > thinking maybe it's better to have the timeout in the driver since the app > > > has no knowledge of how long GuC resets can take. But I can remove it if > > > you think it's not needed. > > > > Maybe I am missing something but I don't get why we would need to provide a > > timeout facility in sysfs? If the library writes here to configure something > > it already has to expect a blocking write by the nature of a a write(2) and > > sysfs contract. It can take long for any reason so I hope we are not > > guaranteeing some latency number to someone? Or the concern is just about > > things getting stuck? In which case I think Ctrl-C is the answer because > > ETIME is not even listed as an errno for write(2). Hmm, good point. > I suggested the timeout on the other version because of that race, > which is fixed now with this approach. It is probably better to remove > it now to avoid confusions. I'm sorry about that. No problem, I've removed the timeout in the latest version. Thanks for the R-b. Ashutosh