On Wed, Jun 08, 2016 at 11:16:13AM +0100, Tvrtko Ursulin wrote: > > On 08/06/16 10:48, Chris Wilson wrote: > >On Tue, Jun 07, 2016 at 01:04:22PM +0100, Tvrtko Ursulin wrote: > >>>+static int intel_breadcrumbs_signaler(void *arg) > >>>+{ > >>>+ struct intel_engine_cs *engine = arg; > >>>+ struct intel_breadcrumbs *b = &engine->breadcrumbs; > >>>+ struct signal *signal; > >>>+ > >>>+ /* Install ourselves with high priority to reduce signalling latency */ > >>>+ signaler_set_rtpriority(); > >>>+ > >>>+ do { > >>>+ set_current_state(TASK_INTERRUPTIBLE); > >>>+ > >>>+ /* We are either woken up by the interrupt bottom-half, > >>>+ * or by a client adding a new signaller. In both cases, > >>>+ * the GPU seqno may have advanced beyond our oldest signal. > >>>+ * If it has, propagate the signal, remove the waiter and > >>>+ * check again with the next oldest signal. Otherwise we > >>>+ * need to wait for a new interrupt from the GPU or for > >>>+ * a new client. > >>>+ */ > >>>+ signal = READ_ONCE(b->first_signal); > >>>+ if (signal_complete(signal)) { > >>>+ /* Wake up all other completed waiters and select the > >>>+ * next bottom-half for the next user interrupt. > >>>+ */ > >>>+ intel_engine_remove_wait(engine, &signal->wait); > >>>+ > >>>+ i915_gem_request_unreference(signal->request); > >>>+ > >>>+ /* Find the next oldest signal. Note that as we have > >>>+ * not been holding the lock, another client may > >>>+ * have installed an even older signal than the one > >>>+ * we just completed - so double check we are still > >>>+ * the oldest before picking the next one. > >>>+ */ > >>>+ spin_lock(&b->lock); > >>>+ if (signal == b->first_signal) > >>>+ b->first_signal = rb_next(&signal->node); > >>>+ rb_erase(&signal->node, &b->signals); > >>>+ spin_unlock(&b->lock); > >>>+ > >>>+ kfree(signal); > >>>+ } else { > >>>+ if (kthread_should_stop()) > >>>+ break; > >>>+ > >>>+ schedule(); > >>>+ } > >>>+ } while (1); > >>>+ > >>>+ return 0; > >>>+} > >> > >>So the thread is only because it is convenient to plug it in the > >>breadcrumbs infrastructure. Otherwise the processing above could be > >>done from a lighter weight context as well since nothing seems to > >>need the process context. > > > >No, seqno processing requires process/sleepable context. The delays we > >incur can be >100us and not suitable for irq/softirq context. > > Nothing in this patch needs it - please say in the commit why it is > choosing the process context then. Bottom half processing requires it. irq_seqno_barrier is not suitable for irq/softirq context. > And why so long delays? It looks pretty lightweight to me. > > >>One alternative could perhaps be to add a waiter->wake_up vfunc and > >>signalers could then potentially use a tasklet? > > > >Hmm, I did find that in order to reduce execlists latency, I had to > >drive the tasklet processing from the signaler. > > What do you mean? The existing execlists tasklet? Now would that work? Due to how dma-fence signals, the softirq is never kicked (spin_lock_irq doesn't handle local_bh_enable()) and so we would only submit a new task via execlists on a reschedule. That latency added about 30% (30s on bsw) to gem_exec_parallel. > >>>+int intel_engine_enable_signaling(struct drm_i915_gem_request *request) > >>>+{ > >>>+ struct intel_engine_cs *engine = request->engine; > >>>+ struct intel_breadcrumbs *b = &engine->breadcrumbs; > >>>+ struct rb_node *parent, **p; > >>>+ struct signal *signal; > >>>+ bool first, wakeup; > >>>+ > >>>+ if (unlikely(IS_ERR(b->signaler))) > >>>+ return PTR_ERR(b->signaler); > >> > >>I don't see that there is a fallback is kthread creation failed. It > >>should just fail in intel_engine_init_breadcrumbs if that happens. > > > >Because it is not fatal to using the GPU, just one optional function. > > But we never expect it to fail and it is not even dependent on > anything user controllable. Just a random error which would cause > user experience to degrade. If thread creation failed it means > system is in such a poor shape I would just fail the driver init. A minimally functional system is better than nothing at all. GEM is not required for driver loading, interrupt driven dma-fences less so. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx