Quoting Lis, Tomasz (2018-03-28 17:06:58) > > > On 2018-03-28 01:27, Chris Wilson wrote: > > Quoting Tomasz Lis (2018-03-27 16:17:59) > >> The patch adds support of preempt-to-idle requesting by setting a proper > >> bit within Execlist Control Register, and receiving preemption result from > >> Context Status Buffer. > >> > >> Preemption in previous gens required a special batch buffer to be executed, > >> so the Command Streamer never preempted to idle directly. In Icelake it is > >> possible, as there is a hardware mechanism to inform the kernel about > >> status of the preemption request. > >> > >> This patch does not cover using the new preemption mechanism when GuC is > >> active. > >> > >> Bspec: 18922 > >> Signed-off-by: Tomasz Lis <tomasz.lis@xxxxxxxxx> > >> --- > >> drivers/gpu/drm/i915/i915_drv.h | 2 ++ > >> drivers/gpu/drm/i915/i915_pci.c | 3 ++- > >> drivers/gpu/drm/i915/intel_device_info.h | 1 + > >> drivers/gpu/drm/i915/intel_lrc.c | 45 +++++++++++++++++++++++++++----- > >> drivers/gpu/drm/i915/intel_lrc.h | 1 + > >> 5 files changed, 45 insertions(+), 7 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > >> index 800230b..c32580b 100644 > >> --- a/drivers/gpu/drm/i915/i915_drv.h > >> +++ b/drivers/gpu/drm/i915/i915_drv.h > >> @@ -2514,6 +2514,8 @@ intel_info(const struct drm_i915_private *dev_priv) > >> ((dev_priv)->info.has_logical_ring_elsq) > >> #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \ > >> ((dev_priv)->info.has_logical_ring_preemption) > >> +#define HAS_HW_PREEMPT_TO_IDLE(dev_priv) \ > >> + ((dev_priv)->info.has_hw_preempt_to_idle) > >> > >> #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv) > >> > >> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c > >> index 4364922..66b6700 100644 > >> --- a/drivers/gpu/drm/i915/i915_pci.c > >> +++ b/drivers/gpu/drm/i915/i915_pci.c > >> @@ -595,7 +595,8 @@ static const struct intel_device_info intel_cannonlake_info = { > >> GEN(11), \ > >> .ddb_size = 2048, \ > >> .has_csr = 0, \ > >> - .has_logical_ring_elsq = 1 > >> + .has_logical_ring_elsq = 1, \ > >> + .has_hw_preempt_to_idle = 1 > >> > >> static const struct intel_device_info intel_icelake_11_info = { > >> GEN11_FEATURES, > >> diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h > >> index 933e316..4eb97b5 100644 > >> --- a/drivers/gpu/drm/i915/intel_device_info.h > >> +++ b/drivers/gpu/drm/i915/intel_device_info.h > >> @@ -98,6 +98,7 @@ enum intel_platform { > >> func(has_logical_ring_contexts); \ > >> func(has_logical_ring_elsq); \ > >> func(has_logical_ring_preemption); \ > >> + func(has_hw_preempt_to_idle); \ > >> func(has_overlay); \ > >> func(has_pooled_eu); \ > >> func(has_psr); \ > >> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > >> index ba7f783..1a22de4 100644 > >> --- a/drivers/gpu/drm/i915/intel_lrc.c > >> +++ b/drivers/gpu/drm/i915/intel_lrc.c > >> @@ -153,6 +153,7 @@ > >> #define GEN8_CTX_STATUS_ACTIVE_IDLE (1 << 3) > >> #define GEN8_CTX_STATUS_COMPLETE (1 << 4) > >> #define GEN8_CTX_STATUS_LITE_RESTORE (1 << 15) > >> +#define GEN11_CTX_STATUS_PREEMPT_IDLE (1 << 29) > >> > >> #define GEN8_CTX_STATUS_COMPLETED_MASK \ > >> (GEN8_CTX_STATUS_COMPLETE | GEN8_CTX_STATUS_PREEMPTED) > >> @@ -183,7 +184,9 @@ static inline bool need_preempt(const struct intel_engine_cs *engine, > >> const struct i915_request *last, > >> int prio) > >> { > >> - return engine->i915->preempt_context && prio > max(rq_prio(last), 0); > >> + return (engine->i915->preempt_context || > >> + HAS_HW_PREEMPT_TO_IDLE(engine->i915)) && > > Well, you haven't actually disabled allocating the preempt_context so... > Yes.. I had mixed feelings about changing needs_preempt_context() now, > as that would mean adding a temporary condition on GuC until the GuC > preemption is merged. > I will add the conditions and disable the allocation in v2 of the patch. > > But at any rate, making this an engine->flag would eliminate one pointer > > dance. > Could be an interesting idea for a separate patch. To land first ;) > >> + prio > max(rq_prio(last), 0); > >> } > >> > >> /** > >> @@ -535,6 +538,25 @@ static void inject_preempt_context(struct intel_engine_cs *engine) > >> execlists_set_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT); > >> } > >> > >> +static void gen11_preempt_to_idle(struct intel_engine_cs *engine) > >> +{ > >> + struct intel_engine_execlists *execlists = &engine->execlists; > >> + > >> + GEM_TRACE("%s\n", engine->name); > >> + > >> + /* > >> + * hardware which HAS_HW_PREEMPT_TO_IDLE(), always also > >> + * HAS_LOGICAL_RING_ELSQ(), so we can assume ctrl_reg is set > >> + */ > >> + GEM_BUG_ON(execlists->ctrl_reg != NULL); > >> + > >> + /* trigger preemption to idle */ > >> + writel(EL_CTRL_PREEMPT_TO_IDLE, execlists->ctrl_reg); > > Future plans? Because just inserting the branch into the setter of > > inject_preempt_context() resolves a lot of conflicts with other work. > My arguments for separate function are: > - better code readability > - keeping the symmetry between execlist and GuC flow - GuC preemption > patches will introduce separate function as well > - only 4 lines of the function would be common > - the name inject_preempt_context() wouldn't match the new purpose, so > renaming would be needed > - reduced self-documenting code due to two separate preempt methods not > having distinct names > > That's all, I don't have any future plans for it. If you want me to > merge the two, let me know. The problem that I am worrying about is that we will duplicate bunch of other code, the actual ELS[PQ] write is the smaller portion. Plus we already have the branch on something much more pleasant. > >> @@ -962,10 +987,13 @@ static void execlists_submission_tasklet(unsigned long data) > >> status, buf[2*head + 1], > >> execlists->active); > >> > >> - if (status & (GEN8_CTX_STATUS_IDLE_ACTIVE | > >> - GEN8_CTX_STATUS_PREEMPTED)) > >> + /* Check if switched to active or preempted to active */ > >> + if ((status & (GEN8_CTX_STATUS_IDLE_ACTIVE | > >> + GEN8_CTX_STATUS_PREEMPTED)) && > >> + !(status & GEN11_CTX_STATUS_PREEMPT_IDLE)) > > Setting HWACK here is harmless as it gets cleared again. Unless, there > > is some oddity in the code flow. > I will check if lack of the change affects test results. > Personally, I would keep this change, even if only for allowing simple > definition of what HWACK flag means. The simple definition is the opposite one, imo. We set the flag after we get the corresponding response from HW; any preemption or activate event must follow the most recent ELSP write. So that will include the preemption event following the preempt-idle write. Then on deciding that the HW is idle, we apply the complication such that execlists->active == 0. (That rule is what breaks the pattern.) -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx