Re: [PATCH 05/20] drm/i915: Support for GuC interrupts

"Goel, Akash" <akash.goel@xxxxxxxxx> · Fri, 12 Aug 2016 21:02:38 +0530

On 8/12/2016 8:35 PM, Tvrtko Ursulin wrote:

On 12/08/16 15:31, Goel, Akash wrote:
On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote:
+static void gen9_guc2host_events_work(struct work_struct *work)
+{
+    struct drm_i915_private *dev_priv =
+        container_of(work, struct drm_i915_private,
guc.events_work);
+
+    spin_lock_irq(&dev_priv->irq_lock);
+    /* Speed up work cancellation during disabling guc
interrupts. */
+    if (!dev_priv->guc.interrupts_enabled) {
+        spin_unlock_irq(&dev_priv->irq_lock);
+        return;

I suppose locking for early exit is something about ensuring the
worker
sees the update to dev_priv->guc.interrupts_enabled done on another
CPU?

Yes locking (providing implicit barrier) will ensure that update made
from another CPU is immediately visible to the worker.

What if the disable happens after the unlock above? It would wait in
disable until the irq handler exits.
Most probably it will not have to wait, as irq handler would have
completed if work item began the execution.
Irq handler just queues the work item, which gets scheduled later on.

Using the lock is beneficial for the case where the execution of work
item and interrupt disabling is done around the same time.

Ok maybe I am missing something.

When can the interrupt disabling happen? Will it be controlled by the
debugfs file or is it driver load/unload and suspend/resume?

yes disabling will happen for all the above 3 scenarios.

+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
u32 gt_iir)
+{
+    bool interrupts_enabled;
+
+    if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
+        spin_lock(&dev_priv->irq_lock);
+        interrupts_enabled = dev_priv->guc.interrupts_enabled;
+        spin_unlock(&dev_priv->irq_lock);

Not sure that taking a lock around only this read is needed.

Again same reason as above, to make sure an update made on another CPU
is immediately visible to the irq handler.

I don't get it, see above. :)

Here also If interrupt disabling & ISR execution happens around the same
time then ISR might miss the reset of 'interrupts_enabled' flag and
queue the new work.

What if reset of interrupts_enabled happens just as the ISR releases the
lock?

Then ISR will proceed ahead and queue the work item.

Lock is useful if reset of interrupts_enabled flag just happens before 
the ISR inspects the value of that flag.
Also lock will help when interrupts_enabled flag is set again, next ISR 
will definitely see it as set.

And same applies to the case when interrupt is re-enabled, ISR might
still see the 'interrupts_enabled' flag as false.
It will eventually see the update though.

+        if (interrupts_enabled) {
+            /* Sample the log buffer flush related bits & clear them
+             * out now itself from the message identity register to
+             * minimize the probability of losing a flush interrupt,
+             * when there are back to back flush interrupts.
+             * There can be a new flush interrupt, for different log
+             * buffer type (like for ISR), whilst Host is handling
+             * one (for DPC). Since same bit is used in message
+             * register for ISR & DPC, it could happen that GuC
+             * sets the bit for 2nd interrupt but Host clears out
+             * the bit on handling the 1st interrupt.
+             */
+            u32 msg = I915_READ(SOFT_SCRATCH(15)) &
+                    (GUC2HOST_MSG_CRASH_DUMP_POSTED |
+                     GUC2HOST_MSG_FLUSH_LOG_BUFFER);
+            if (msg) {
+                /* Clear the message bits that are handled */
+                I915_WRITE(SOFT_SCRATCH(15),
+                    I915_READ(SOFT_SCRATCH(15)) & ~msg);

Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it
twice?

Thought reading it again (just before the update) is bit safer compared
to reading it once, as there is a potential race problem here.
GuC could also write to the SOFT_SCRATCH(15) register, set new events
bit, while Host clears off the bit of handled events.

Don't get it. If there is a race between read and write there still is,
don't see how a second read makes it safer.

Yes can't avoid the race completely by double reads, but can reduce the
race window size.

There was only one thing between the two reads, and that was "if (msg)":

 +            u32 msg = I915_READ(SOFT_SCRATCH(15)) &
 +                    (GUC2HOST_MSG_CRASH_DUMP_POSTED |
 +                     GUC2HOST_MSG_FLUSH_LOG_BUFFER);

 +            if (msg) {

 +                /* Clear the message bits that are handled */
 +                I915_WRITE(SOFT_SCRATCH(15),
 +                    I915_READ(SOFT_SCRATCH(15)) & ~msg);

Also I felt code looked better in current form, as macros
GUC2HOST_MSG_CRASH_DUMP_POSTED & GUC2HOST_MSG_FLUSH_LOG_BUFFER were used
only once.

Will change as per the initial implementation.

     u32 msg = I915_READ(SOFT_SCRATCH(15));
     if (msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED |
            GUC2HOST_MSG_FLUSH_LOG_BUFFER) {
         msg &= ~(GUC2HOST_MSG_CRASH_DUMP_POSTED |
              GUC2HOST_MSG_FLUSH_LOG_BUFFER);
         I915_WRITE(SOFT_SCRATCH(15), msg);
     }

Or:
    u32 msg, flush;

    msg = I915_READ(SOFT_SCRATCH(15));
    flush = msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED |
GUC2HOST_MSG_FLUSH_LOG_BUFFER);
    if (flush) {
        I915_WRITE(SOFT_SCRATCH(15), ~flush);
        ... queue woker ...

?
Thanks much, will change as per the above.

Also, is the RMW outside any locks safe?

Ideally need a way to atomically do the RMW, i.e. read the register
value, clear off the handled events bit and update the register with
the
modified value.

Please kindly suggest how to address the above.
Or can this be left as a TODO, when we do start handling other events
also.

From the comment in code above it sounds like a GuC fw interface
shortcoming - that there is a single bit for two different interrupt
sources, is that right?
Yes that shortcoming is there, GUC2HOST_MSG_FLUSH_LOG_BUFFER bit is used
for conveying the flush for ISR & DPC log buffers.

Is there any other register or something that
you can read to detect that the interrupt has been re-asserted while in
the irq handler?

Although I thought you said before that the GuC will
not do that - that it won't re-assert the interrupt before we send the
flush command.
Yes that is the case, but with respect to one type of a log buffer, like
for example unless GuC firmware receives the ack for DPC log
buffer it won't send a new flush for DPC buffer, but if meanwhile ISR
buffer becomes half full it will send a flush interrupt.

So the potential for losing interrupts is unavoidable it seems. :( If I
am understanding this correctly.

Yes unavoidable, but that's a very small window.
Have apprised GuC folks about this.

Best Regards
Akash

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx