On Thu, May 04, 2017 at 04:13:51AM +0800, Ming Lei wrote:
> On Thu, May 4, 2017 at 12:46 AM, Omar Sandoval <osandov@xxxxxxxxxxx> wrote:
> > On Fri, Apr 28, 2017 at 11:15:36PM +0800, Ming Lei wrote:
> >> When blk-mq I/O scheduler is used, we need two tags for
> >> submitting one request. One is called scheduler tag for
> >> allocating request and scheduling I/O, another one is called
> >> driver tag, which is used for dispatching IO to hardware/driver.
> >> This way introduces one extra per-queue allocation for both tags
> >> and request pool, and may not be as efficient as case of none
> >> scheduler.
> >>
> >> Also currently we put a default per-hctx limit on schedulable
> >> requests, and this limit may be a bottleneck for some devices,
> >> especialy when these devices have a quite big tag space.
> >>
> >> This patch introduces BLK_MQ_F_SCHED_USE_HW_TAG so that we can
> >> allow to use hardware/driver tags directly for IO scheduling if
> >> devices's hardware tag space is big enough. Then we can avoid
> >> the extra resource allocation and make IO submission more
> >> efficient.
> >>
> >> Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> >> ---
> >>  block/blk-mq-sched.c   | 10 +++++++++-
> >>  block/blk-mq.c         | 35 +++++++++++++++++++++++++++++------
> >>  include/linux/blk-mq.h |  1 +
> >>  3 files changed, 39 insertions(+), 7 deletions(-)
> >
> > One more note on this: if we're using the hardware tags directly, then
> > we are no longer limited to q->nr_requests requests in-flight. Instead,
> > we're limited to the hw queue depth. We probably want to maintain the
> > original behavior,
>
> That need further investigation, and generally scheduler should be happy with
> more requests which can be scheduled.
>
> We can make it as one follow-up.

If we say nr_requests is 256, then we should honor that. So either
update nr_requests to reflect the actual depth we're using or resize the
hardware tags.

> > so I think we need to resize the hw tags in blk_mq_init_sched() if we're
> > using hardware tags.
>
> That might not be good since hw tags are used by both scheduler and
> dispatching.

What do you mean? If we have BLK_MQ_F_SCHED_USE_HW_TAG set, then they
are not used for dispatching, and of course we shouldn't resize the
hardware tags if we are using scheduler tags.
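
To make the two options concrete, here is a rough userspace sketch. It is
not kernel code and not what the patch does: struct model_queue and the
three helpers are made-up names for illustration, and sched_use_hw_tag
merely stands in for BLK_MQ_F_SCHED_USE_HW_TAG being set.

```c
#include <assert.h>

/* Hypothetical model of a queue; these are NOT the kernel's structures. */
struct model_queue {
	unsigned int nr_requests;    /* advertised limit on schedulable requests */
	unsigned int hw_queue_depth; /* size of the hardware/driver tag space */
	int sched_use_hw_tag;        /* stands in for BLK_MQ_F_SCHED_USE_HW_TAG */
};

/* How many requests can be in flight under this model. */
static unsigned int inflight_limit(const struct model_queue *q)
{
	assert(q->nr_requests > 0 && q->hw_queue_depth > 0);
	/* With hw tags used for scheduling, the hw depth is the real limit. */
	return q->sched_use_hw_tag ? q->hw_queue_depth : q->nr_requests;
}

/* Option 1: make nr_requests reflect the depth actually in use. */
static void sync_nr_requests(struct model_queue *q)
{
	if (q->sched_use_hw_tag)
		q->nr_requests = q->hw_queue_depth;
}

/* Option 2: shrink the hw tags to nr_requests, only when hw tags are used
 * for scheduling. With separate scheduler tags they are left alone. */
static void resize_hw_tags(struct model_queue *q)
{
	if (q->sched_use_hw_tag && q->hw_queue_depth > q->nr_requests)
		q->hw_queue_depth = q->nr_requests;
}
```

Either way, the in-flight limit and nr_requests end up agreeing, which is
the point: whatever number we advertise should be the one we enforce.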