On 01/03/2022 15:13, Sebastian Andrzej Siewior wrote:
On 2022-03-01 14:27:18 [+0000], Tvrtko Ursulin wrote:
you see:
0003-drm-i915-Use-preempt_disable-enable_rt-where-recomme.patch
0004-drm-i915-Don-t-disable-interrupts-on-PREEMPT_RT-duri.patch
Two for the display folks.
0005-drm-i915-Don-t-check-for-atomic-context-on-PREEMPT_R.patch
What do preempt_disable/enable do on PREEMPT_RT? I am wondering if instead the
solution could be to always force the !ATOMIC path (for the whole
_wait_for_atomic macro) on PREEMPT_RT.
Could be one way to handle it. But please don't disable preemption and/or
interrupts for longer periods of time, as all of it increases the
overall latency.
I am looking for your guidance of what is the correct thing here.
The main purpose of this macro on the i915 side is to do short waits for GPU
registers to change after a write, from spin-locked sections. But there were
rare cases when very short waits were needed from unlocked sections,
shorter than 10us (which AFAIR is what usleep_range documents as the
lower limit). That is why the non-atomic path was added to the macro.
That path uses preempt_disable/enable so it can use local_clock().
All this may, or may not, be compatible with PREEMPT_RT to start with?
Or, phrased differently, how should we implement the <10us waits
from non-atomic sections under PREEMPT_RT?
Side note: all of these patches are a collection over time. I personally
have only a single i7 Sandy Bridge with i915, and here I don't really
enter all the possible paths. People report, I patch and look
around, and then they go quiet, so I assume that it is working.
0006-drm-i915-Disable-tracing-points-on-PREEMPT_RT.patch
If the issue is only with certain trace points why disable all?
It is a class and it is easier that way.
0007-drm-i915-skip-DRM_I915_LOW_LEVEL_TRACEPOINTS-with-NO.patch
I didn't quite understand: why is this not fixable? Especially thinking
of the option of not blanket-disabling all tracepoints, as in the previous
patch.
The problem is that you can't acquire that lock from within that
trace-point on PREEMPT_RT. On !RT it is possible, but it is also
problematic because lockdep does not see possible deadlocks unless that
trace-point is enabled.
Oh, I meant: could the include-ordering problem be fixed differently?
"""
[PATCH 07/10] drm/i915: skip DRM_I915_LOW_LEVEL_TRACEPOINTS with
NOTRACE
The order of the header files is important. If this header file is
included after tracepoint.h was included then the NOTRACE here becomes a
nop. Currently this happens for two .c files which use the tracepoints
behind DRM_I915_LOW_LEVEL_TRACEPOINTS.
"""
Like these two .c files - can order of includes just be changed in them?
I've been talking to Steven (after
https://lkml.kernel.org/r/20211214115837.6f33a9b2@xxxxxxxxxxxxxxxxxx)
and he wants to come up with something where you can pass a lock as
argument to the tracing-API. That way the lock can be acquired before
the trace event is invoked and lockdep will see it even if the trace
event is disabled.
So there is an idea how to get it to work eventually without disabling
it in the long term.
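Roughly, the idea (with purely hypothetical names; no such API exists yet) is to let the tracing core take a caller-supplied lock unconditionally, rather than only inside the event when it fires:

```c
/* Today: the lock is only taken inside the event when tracing is on,
 * so lockdep never records the dependency on a quiet system. */
trace_i915_request_in(rq, idx);

/* Proposed (hypothetical): the lock is handed to the tracing API and
 * acquired before the event is evaluated, enabled or not. */
trace_i915_request_in_locked(&lock, rq, idx);
```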
Making the register lock a raw_spinlock_t would solve the problem immediately,
but I am a little worried given the increased latency seen in a quick test:
https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfyf7g@xxxxxxxxxxxxx/
Also, this is one single piece of hardware, but the upper limit of the
atomic polls is high.
0008-drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch
Not sure why cond_resched was put between irq_work_queue and
irq_work_sync - would it not be a like-for-like change to have the two
together?
Maybe it loops for a while and an additional scheduling point would be nice.
The commit message makes me think _queue already starts the handler, on
x86 at least.
Yes, irq_work_queue() triggers the IRQ right away on x86,
irq_work_sync() would wait for it to happen in case it did not happen.
On architectures which don't provide an IRQ-work interrupt, it is
delayed to the HZ tick timer interrupt. So this also serves as an
example in case someone wants to copy the code ;)
My question wasn't why is there a need_resched() in there, but why is
the patch:
+ irq_work_queue(&b->irq_work);
cond_resched();
+ irq_work_sync(&b->irq_work);
And not:
+ irq_work_queue(&b->irq_work);
+ irq_work_sync(&b->irq_work);
cond_resched();
To preserve like for like, if my understanding of the commit message was
correct.
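For reference, the queue-then-sync pairing under discussion is the stock irq_work pattern (kernel-side sketch; the callback and wrapper names are illustrative):

```c
#include <linux/irq_work.h>

static void signal_irq_work(struct irq_work *work)
{
	/* Runs from the IRQ-work interrupt; on architectures without
	 * one, it is deferred to the next HZ tick. */
}

static struct irq_work work;

static void trigger_and_wait(void)
{
	init_irq_work(&work, signal_irq_work);
	irq_work_queue(&work);	/* on x86 the self-IPI fires right away */
	irq_work_sync(&work);	/* waits for the callback in case it has
				 * not completed yet */
}
```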
0009-drm-i915-gt-Use-spin_lock_irq-instead-of-local_irq_d.patch
I think this is okay. The part after the unlock is serialized by the tasklet
already.
Slight doubt due to the comment:
local_irq_enable(); /* flush irq_work (e.g. breadcrumb enabling) */
Makes me want to think about it harder but not now.
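The transformation in the patch is the standard one (sketch only; the lock name is illustrative). On PREEMPT_RT the spinlock is a sleeping lock, so spin_lock_irq() there does not actually disable interrupts, which is exactly why the open-coded variant is a problem:

```c
/* Before: open-coded interrupt toggling around a plain spin_lock(). */
local_irq_disable();
spin_lock(&engine->active.lock);
/* ... */
spin_unlock(&engine->active.lock);
local_irq_enable(); /* flush irq_work (e.g. breadcrumb enabling) */

/* After: one combined operation, which PREEMPT_RT can map onto a
 * sleeping lock without disabling interrupts. */
spin_lock_irq(&engine->active.lock);
/* ... */
spin_unlock_irq(&engine->active.lock);
```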
Clark reported it and confirmed that the warning is gone on RT and
everything appears to work ;)
I will need to think about it harder at some point.
PREEMPT_RT-wise there is no synchronisation vs irq_work other than an
actual lock (in case one is needed).
Okay, marking as todo/later for me. Need to see if enabling breadcrumbs
earlier than it used to be after this patch makes any difference.
Another thing to check is if execlists_context_status_change still needs the
atomic notifier chain with this change.
0010-drm-i915-Drop-the-irqs_disabled-check.patch
LGTM.
Do you want me to repost that one?
I think it's up to you whether you go one by one, or repost the whole
series or whatever.
Revert-drm-i915-Depend-on-PREEMPT_RT.patch
Okay.
And finally for this very patch (the thread I am replying to):
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Thanks.
Np - and I've pushed this one.
Regards,
Tvrtko