On Mon, Oct 28, 2024 at 04:49:21PM +0200, Jani Nikula wrote:
> On Mon, 28 Oct 2024, Lucas De Marchi <lucas.demarchi@xxxxxxxxx> wrote:
> > +Jani
> >
> > On Fri, Sep 01, 2023 at 05:04:02PM +0300, Imre Deak wrote:
> >>Disabling HPD polling from i915_hpd_poll_init_work() involves probing
> >>all display connectors explicitly to account for lost hotplug
> >>interrupts. On some platforms (mostly pre-ICL) with HDMI connectors the
> >>I2C EDID bit-banging using udelay() triggers in turn the
> >>
> >>  workqueue: i915_hpd_poll_init_work [i915] hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
> >>
> >>warning.
> >>
> >>Fix the above by scheduling i915_hpd_poll_init_work() on a WQ_UNBOUND
> >>workqueue. It's ok to use a system WQ, since i915_hpd_poll_init_work()
> >>is properly flushed in intel_hpd_cancel_work().
> >>
> >>The connector probing from drm_mode_config::output_poll_work resulting
> >>in the same warning is fixed by the next patch.
> >>
> >>Cc: Tejun Heo <tj@xxxxxxxxxx>
> >>Cc: Heiner Kallweit <hkallweit1@xxxxxxxxx>
> >>CC: stable@xxxxxxxxxxxxxxx # 6.5
> >>Suggested-by: Tejun Heo <tj@xxxxxxxxxx>
> >>Suggested-by: Heiner Kallweit <hkallweit1@xxxxxxxxx>
> >>Reported-by: Heiner Kallweit <hkallweit1@xxxxxxxxx>
> >>Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9245
> >>Link: https://lore.kernel.org/all/f7e21caa-e98d-e5b5-932a-fe12d27fde9b@xxxxxxxxx
> >>Signed-off-by: Imre Deak <imre.deak@xxxxxxxxx>
> >>---
> >> drivers/gpu/drm/i915/display/intel_hotplug.c | 4 ++--
> >> 1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/display/intel_hotplug.c b/drivers/gpu/drm/i915/display/intel_hotplug.c
> >>index e8562f6f8bb44..accc2fec562a0 100644
> >>--- a/drivers/gpu/drm/i915/display/intel_hotplug.c
> >>+++ b/drivers/gpu/drm/i915/display/intel_hotplug.c
> >>@@ -774,7 +774,7 @@ void intel_hpd_poll_enable(struct drm_i915_private *dev_priv)
> >> 	 * As well, there's no issue if we race here since we always reschedule
> >> 	 * this worker anyway
> >> 	 */
> >>-	queue_work(dev_priv->unordered_wq,
> >>+	queue_work(system_unbound_wq,
> >> 		   &dev_priv->display.hotplug.poll_init_work);
> >> }
> >>
> >>@@ -803,7 +803,7 @@ void intel_hpd_poll_disable(struct drm_i915_private *dev_priv)
> >> 		return;
> >>
> >> 	WRITE_ONCE(dev_priv->display.hotplug.poll_enabled, false);
> >>-	queue_work(dev_priv->unordered_wq,
> >>+	queue_work(system_unbound_wq,
> >
> > This 1y+ patch doesn't apply anymore and now we also have xe to account
> > for. I'm reviving this since we are unifying the kernel config in CI
> > and now several machines testing i915 start to hit this warning.
> >
> > Looking at the current code for xe we have:
> >
> > drivers/gpu/drm/xe/xe_device_types.h:
> >
> > 	/** @unordered_wq: used to serialize unordered work, mostly display */
> > 	struct workqueue_struct *unordered_wq;
> >
> > ... which is, actually, just display.
> >
> > Jani, should we move this wq to display where it belongs, with the right
> > flags, rather than queueing it on system_unbound_wq?
>
> I think the general answer is:
>
> 1. Never use i915 or xe core workqueues in display code.
>
> 2. Use system workqueues where appropriate for display, and cancel the
> individual work where needed. While there are legitimate uses for the
> dedicated workqueues, I'm guessing a large portion of their use may
> be cargo-culting and/or the result of system wq flush going away, and
> not based on any real analysis.
>
> 3. For display stuff, add the minimal necessary workqueues in display
> code, and use them where appropriate. Also cancel the individual work
> where needed, and flush the entire workqueue only when really
> necessary. The entire workqueue flushing is also cargo-culting and/or
> the result of system wq flush going away.
>
> 4. Finally move the display wq init/cleanup to display code.
>
> Now, I don't know what the answer in this particular case, or many of
> the other cases, is. And that's probably one of the reasons we have the
> status quo. :(
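To make (2) and (3) above concrete, the shape would be roughly the
following (just a sketch; the struct layout and all names below are
made up for illustration, not actual i915/xe code):

#include <linux/workqueue.h>

/* hypothetical display-owned state */
struct intel_display_hotplug {
	struct workqueue_struct *unordered_wq;	/* minimal, display-owned */
	struct work_struct poll_init_work;
};

static int intel_display_hotplug_init(struct intel_display_hotplug *hpd)
{
	/*
	 * WQ_UNBOUND, so long-running work (e.g. the EDID bit-banging
	 * above) doesn't hog a CPU and trigger the workqueue warning.
	 */
	hpd->unordered_wq = alloc_workqueue("intel_display_unordered",
					    WQ_UNBOUND, 0);
	if (!hpd->unordered_wq)
		return -ENOMEM;

	return 0;
}

static void intel_display_hotplug_cleanup(struct intel_display_hotplug *hpd)
{
	/* cancel the individual work; no need to flush the entire wq */
	cancel_work_sync(&hpd->poll_init_work);

	destroy_workqueue(hpd->unordered_wq);
}

That would also keep the wq init/cleanup entirely within display code,
per (4).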
The work items scheduled for hotplug events (like the above
poll_init_work) are flushed explicitly, so I think the system
workqueues should be used for those (as opposed to driver specific
workqueues). See the sketch at the end of this mail for what "flushed
explicitly" refers to.

> BR,
> Jani.
>
> --
> Jani Nikula, Intel
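By "flushed explicitly" I mean the pattern intel_hpd_cancel_work()
follows: each hotplug work is cancelled and waited on individually, so
nothing ever depends on flushing a system workqueue as a whole. A
simplified sketch, not the verbatim function body:

#include <linux/workqueue.h>

void intel_hpd_cancel_work(struct drm_i915_private *dev_priv)
{
	/*
	 * Sync each individual work; safe on system_unbound_wq because
	 * nothing relies on flushing the whole queue.
	 */
	cancel_work_sync(&dev_priv->display.hotplug.poll_init_work);
	/* ... likewise for the other hotplug works ... */
}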