On 04/08/2016 12:06 PM, Keith Busch wrote:
On Fri, Apr 08, 2016 at 01:40:06PM -0400, Matthew Wilcox wrote:
- Inability to use all queues supported by a device. Intel's P3700
supports 31 queues, but blk-mq insists on assigning the same number of
CPUs to each queue. So if you have 48 CPUs, it will use 24 queues.
If you have 128 CPUs, it will only use 16 of the queues.
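
The two figures quoted above are consistent with a simple model: pick the
largest queue count that does not exceed what the device offers and that
divides the CPU count evenly. The sketch below is only that model, written
to reproduce the arithmetic; it is not the kernel's mapping code, and the
function name symmetric_queue_count is made up for illustration.

```python
def symmetric_queue_count(nr_cpus: int, nr_hw_queues: int) -> int:
    """Largest queue count <= nr_hw_queues that divides nr_cpus evenly.

    Illustrative model only (an assumption, not the blk-mq implementation):
    every queue must serve the same number of CPUs.
    """
    if nr_cpus < 1 or nr_hw_queues < 1:
        raise ValueError("need at least one CPU and one queue")
    # Walk down from the most queues we could use until the split is even.
    for queues in range(min(nr_cpus, nr_hw_queues), 0, -1):
        if nr_cpus % queues == 0:
            return queues
    return 1  # not reached: 1 divides every nr_cpus


for cpus in (48, 128):
    used = symmetric_queue_count(cpus, 31)
    print(f"{cpus} CPUs -> {used} of 31 queues, {cpus // used} CPUs per queue")
```

With 31 hardware queues this gives 24 queues for 48 CPUs and 16 queues for
128 CPUs, matching the numbers in the complaint.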
While it'd be better to use all the available h/w resources, that's
actually not the worst part.
The real problems occur when there are more physical/unique CPUs than
h/w queues since blk-mq does not consider CPU topology beyond thread
siblings. With 128 CPUs, blk-mq may use all 31 queues the P3700 supports,
but many of the CPU groups sharing a queue won't share a last-level cache.
Smarter assignment would reclaim some untapped performance, and we can
share such code prior to the session.
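
To make the topology point concrete, here is a rough sketch of one possible
"smarter" assignment: split the queues across last-level-cache (LLC)
domains first, then spread each domain's CPUs over its own queues, so no
queue is shared across an LLC boundary. This is not the code referred to
above and not what blk-mq does; the function map_cpus_by_llc, the
cpu_to_llc input and the 4 x 32 CPU layout in the example are all
assumptions for illustration. It also assumes there is at least one queue
per LLC domain.

```python
from collections import defaultdict
from typing import Dict


def map_cpus_by_llc(cpu_to_llc: Dict[int, int], nr_queues: int) -> Dict[int, int]:
    """Return a cpu -> queue map in which no queue spans two LLC domains."""
    domains = defaultdict(list)
    for cpu in sorted(cpu_to_llc):
        domains[cpu_to_llc[cpu]].append(cpu)
    if nr_queues < len(domains):
        raise ValueError("sketch assumes at least one queue per LLC domain")

    # Start with one queue per domain, then give each spare queue to the
    # domain that currently has the most CPUs per queue.
    share = {llc: 1 for llc in domains}
    for _ in range(nr_queues - len(domains)):
        candidates = [llc for llc in domains if share[llc] < len(domains[llc])]
        if not candidates:
            break  # more queues than CPUs; leave the rest unused
        busiest = max(candidates, key=lambda llc: len(domains[llc]) / share[llc])
        share[busiest] += 1

    mapping = {}
    first_queue = 0
    for llc in sorted(domains):
        cpus = domains[llc]
        for i, cpu in enumerate(cpus):
            # Spread this domain's CPUs evenly over its own block of queues.
            mapping[cpu] = first_queue + i * share[llc] // len(cpus)
        first_queue += share[llc]
    return mapping


# Hypothetical box: 128 CPUs in four LLC domains of 32 CPUs each.
cpu_to_llc = {cpu: cpu // 32 for cpu in range(128)}
mapping = map_cpus_by_llc(cpu_to_llc, 31)
print("queues in use:", len(set(mapping.values())))
```

The trade-off is that the layout is no longer symmetrical: in this example
three domains get 8 queues and one gets 7, so some queues serve more CPUs
than others.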
There's definitely room for improvement in the cpu mapping code.
However, on the original complaint, it's by design (or, working as
intended) - this was done to keep the layout symmetrical. It's been
discussed on the mailing lists before. We can have a discussion whether
we should change this or not, of course.
--
Jens Axboe