On 30/11/15 14:34, Chris Wilson wrote:
> One particularly stressful scenario consists of many independent tasks all competing for GPU time and waiting upon the results (e.g. realtime transcoding of many, many streams). One bottleneck in particular is that each client waits on its own results, but every client is woken up after every batchbuffer - hence the thunder of hooves as then every client must do its heavyweight dance to read a coherent seqno to see if it is the lucky one. Alternatively, we can have one kthread responsible for waking after an interrupt, checking the seqno and only waking up the waiting clients who are complete. The disadvantage is that in the uncontended scenario (i.e. only one waiter) we incur an extra context switch in the wakeup path - though that should be mitigated somewhat by the busy-wait we do first before sleeping.
This discussion reminds me of an approach we took in [another OS], where the interrupt handler always woke just the first waiter; that thread, if the wakeup wasn't of interest to itself, then did the extra work to figure out which other thread /should/ be woken. That both minimised latency for the single-waiter scenario and avoided wake_all() from interrupt code in the multiple-waiter case. Oh, and IIRC we had a yield_to() in there so that the spuriously-woken first waiter went back to waiting and the correctly-woken thread immediately got to take over the CPU :)
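To make the comparison concrete, here is a rough toy model in Python (no real threads or interrupts, just wakeup counting). The function names and the accounting are purely illustrative, not taken from any driver, and it assumes that the woken first waiter wakes every other waiter whose seqno has already passed:

```python
def wake_all(targets, interrupts):
    """Every interrupt wakes every pending waiter; each then checks the seqno."""
    pending = list(targets)          # seqno each waiter is waiting for
    wakeups = 0
    for seqno in interrupts:
        wakeups += len(pending)      # the whole herd wakes up
        pending = [t for t in pending if t > seqno]
    return wakeups, pending


def wake_first(targets, interrupts):
    """The interrupt wakes only the first waiter, which wakes the right ones."""
    pending = list(targets)          # FIFO order; pending[0] is the first waiter
    wakeups = 0
    for seqno in interrupts:
        if not pending:
            continue                 # nobody waiting, nothing to wake
        wakeups += 1                 # interrupt handler wakes the first waiter only
        # Assumption: the first waiter scans the queue on everyone's behalf
        # and wakes each *other* waiter whose target seqno has passed.
        wakeups += sum(1 for t in pending[1:] if t <= seqno)
        pending = [t for t in pending if t > seqno]
    return wakeups, pending


if __name__ == "__main__":
    targets = [3, 1, 2, 4]           # four waiters, first in queue wants seqno 3
    interrupts = [1, 2, 3, 4]        # seqno observed after each interrupt
    print("wake_all  :", wake_all(targets, interrupts))
    print("wake_first:", wake_first(targets, interrupts))
    print("single    :", wake_first([1], [1]))
```

In this model the single-waiter case costs exactly one wakeup, and the contended case only pays for the first waiter plus the waiters that are actually complete. The yield_to() part isn't modelled at all.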
I don't know how practical that would be inside Linux though ...

.Dave.