On Sun, Aug 28, 2016 at 05:37:47PM +0100, Chris Wilson wrote:
> Currently we install a callback for performing poll on a dma-buf,
> irrespective of the timeout. This involves taking a spinlock, as well as
> unnecessary work, and greatly reduces scaling of poll(.timeout=0) across
> multiple threads.
>
> We can query whether the poll will block prior to installing the
> callback to make the busy-query fast.
>
> Single thread: 60% faster
> 8 threads on 4 (+4 HT) cores: 600% faster

Hmm, this only really applies to the idle case.
reservation_object_test_signaled_rcu() is still a major bottleneck when
busy, due to the dance inside reservation_object_test_signaled_single().
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre