On 2022-03-02 11:42:35 [+0000], Tvrtko Ursulin wrote:
> > > > 0005-drm-i915-Don-t-check-for-atomic-context-on-PREEMPT_R.patch
> > >
> > > What do preempt_disable/enable do on PREEMPT_RT? Thinking if instead
> > > the solution could be to always force the !ATOMIC path (for the whole
> > > _wait_for_atomic macro) on PREEMPT_RT.
> >
> > Could be one way to handle it. But please don't disable preemption
> > and/or interrupts for longer periods of time as all of it increases
> > the overall latency.
>
> I am looking for your guidance on what is the correct thing here.
>
> The main purpose of this macro on the i915 side is to do short waits for
> GPU registers to change after a write, from spin-locked sections. But
> there were rare cases when very short waits were needed from unlocked
> sections, shorter than 10us (which is AFAIR the lower limit the
> usleep_range documentation recommends). That is why the non-atomic path
> was added to the macro. That path uses preempt_disable/enable so it can
> use local_clock().
>
> All this may, or may not be, compatible with PREEMPT_RT to start with?

Your assumption about atomic is not correct and that is why I aim to
ignore it for RT, or maybe alter it so it fits. It is assumed that
in_atomic() is true in an interrupt handler or with an acquired
spinlock_t, right? On PREEMPT_RT both conditions keep the context
preemptible, so the atomic check triggers. However, both (the force
threaded interrupt handler and the spinlock_t) ensure that the task is
stuck on the CPU. So maybe your _WAIT_FOR_ATOMIC_CHECK() could point to
cant_migrate() - see the sketch further down in this mail. It looks like
you are trying to ensure that local_clock() is read from the same CPU.

> Or, question phrased differently, how should we implement the <10us
> waits from non-atomic sections under PREEMPT_RT?

I think if you swap the check in _WAIT_FOR_ATOMIC_CHECK() it should be
good. After all, the task remains preemptible during the condition polls
so it should work.

> > The problem is that you can't acquire that lock from within that
> > trace-point on PREEMPT_RT. On !RT it is possible but it is also
> > problematic because LOCKDEP does not see possible dead locks unless
> > that trace-point is enabled.
>
> Oh, I meant could the include ordering problem be fixed differently?
>
> """
> [PATCH 07/10] drm/i915: skip DRM_I915_LOW_LEVEL_TRACEPOINTS with
> NOTRACE
>
> The order of the header files is important. If this header file is
> included after tracepoint.h was included then the NOTRACE here becomes
> a nop. Currently this happens for two .c files which use the
> tracepoints behind DRM_I915_LOW_LEVEL_TRACEPOINTS.
> """
>
> Like these two .c files - can the order of includes just be changed in
> them?

Maybe. Let me check and get back to you.

> > I've been talking to Steven (after
> > https://lkml.kernel.org/r/20211214115837.6f33a9b2@xxxxxxxxxxxxxxxxxx)
> > and he wants to come up with something where you can pass a lock as an
> > argument to the tracing-API. That way the lock can be acquired before
> > the trace event is invoked and lockdep will see it even if the trace
> > event is disabled.
> > So there is an idea how to get it to work eventually without disabling
> > it in the long term.
> >
> > Making the register lock a raw_spinlock_t would solve the problem
> > immediately but I am a little worried given the increased latency in a
> > quick test:
> > https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfyf7g@xxxxxxxxxxxxx/
> >
> > Also, this is a single piece of hardware, but the upper limit for the
> > atomic polls is high.
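To illustrate the _WAIT_FOR_ATOMIC_CHECK() swap I have in mind, a rough
sketch (the current !RT macro body below is from memory and partly an
assumption on my side, so please check it against the actual i915 code):

  /*
   * On PREEMPT_RT, in_atomic() stays false in force-threaded interrupt
   * handlers and under spinlock_t, but the task still cannot migrate -
   * which is all a local_clock() based poll needs. So assert "cannot
   * migrate" instead of "is atomic" on RT.
   */
  #if defined(CONFIG_PREEMPT_RT)
  # define _WAIT_FOR_ATOMIC_CHECK(ATOMIC)	cant_migrate()
  #else
  # define _WAIT_FOR_ATOMIC_CHECK(ATOMIC) \
  	WARN_ON_ONCE((ATOMIC) && !in_atomic())
  #endif

cant_migrate() compiles to a nop unless the debug options are enabled,
so this should not add overhead to production builds.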
> > > > 0008-drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch
> > >
> > > Not sure why cond_resched was put between irq_work_queue and
> > > irq_work_sync - would it not be a like-for-like change to have the
> > > two together?
> >
> > Maybe it loops for a while and an additional scheduling point would
> > be nice.
> >
> > > The commit message makes me think _queue already starts the handler
> > > on x86 at least.
> >
> > Yes, irq_work_queue() triggers the IRQ right away on x86;
> > irq_work_sync() would wait for it to happen in case it did not. On
> > architectures which don't provide an IRQ-work interrupt, it is
> > delayed to the HZ tick timer interrupt. So this also serves as an
> > example in case someone wants to copy the code ;)
>
> My question wasn't why there is a need_resched() in there, but why the
> patch is:
>
> +	irq_work_queue(&b->irq_work);
> 	cond_resched();
> +	irq_work_sync(&b->irq_work);
>
> And not:
>
> +	irq_work_queue(&b->irq_work);
> +	irq_work_sync(&b->irq_work);
> 	cond_resched();
>
> To preserve like for like, if my understanding of the commit message
> was correct.

No strong need, it can be put as you suggest. Should someone else
schedule &b->irq_work from another CPU then you could first attempt to
cond_resched() and then wait for &b->irq_work's completion. Assuming
that this does not happen (because the irq_work was previously queued
and invoked immediately), irq_work_sync() will just return.

> > > > 0009-drm-i915-gt-Use-spin_lock_irq-instead-of-local_irq_d.patch
> > >
> > > I think this is okay. The part after the unlock is serialized by
> > > the tasklet already.
> > >
> > > Slight doubt due to the comment:
> > >
> > > 	local_irq_enable(); /* flush irq_work (e.g. breadcrumb enabling) */
> > >
> > > Makes me want to think about it harder, but not now.
> >
> > Clark reported it and confirmed that the warning is gone on RT and
> > everything appears to work ;)
>
> I will need to think about it harder at some point.
>
> > PREEMPT_RT wise there is no synchronisation vs irq_work other than an
> > actual lock (in case it is needed).
>
> Okay, marking as todo/later for me. Need to see if enabling breadcrumbs
> earlier than it used to be after this patch makes any difference.

Okay.

> > > Another thing to check is if execlists_context_status_change still
> > > needs the atomic notifier chain with this change.
> > >
> > > > 0010-drm-i915-Drop-the-irqs_disabled-check.patch
> > >
> > > LGTM.
> >
> > Do you want me to repost that one?
>
> I think it's up to you whether you go one by one, or repost the whole
> series or whatever.

If it is up to me then let me repost that one single patch and I have it
out of my queue :) And
0008-drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch without
cond_resched() in the middle - the resulting order is sketched in the
P.S. below.

> Regards,
>
> Tvrtko

Sebastian
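P.S.: For completeness, the resulting sequence in 0008 with your
suggested ordering would be (a sketch of just these three lines, not the
full hunk):

  irq_work_queue(&b->irq_work);	/* on x86 raises the IRQ-work
  				 * interrupt right away */
  irq_work_sync(&b->irq_work);	/* waits for the handler to finish;
  				 * returns immediately if it already
  				 * ran */
  cond_resched();		/* scheduling point afterwards */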