On Tue, May 16, 2023 at 09:20:55AM +0800, Ming Lei wrote: > On Mon, May 15, 2023 at 02:22:18PM -0600, Keith Busch wrote: > > On Mon, May 15, 2023 at 08:52:38AM -0700, Bart Van Assche wrote: > > > On 5/15/23 07:46, Ming Lei wrote: > > > > @@ -48,7 +53,7 @@ blk_mq_sched_allow_merge(struct request_queue *q, struct request *rq, > > > > static inline void blk_mq_sched_completed_request(struct request *rq, u64 now) > > > > { > > > > - if (rq->rq_flags & RQF_ELV) { > > > > + if ((rq->rq_flags & RQF_ELV) && !blk_mq_bypass_sched(rq->cmd_flags)) { > > > > struct elevator_queue *e = rq->q->elevator; > > > > if (e->type->ops.completed_request) > > > > @@ -58,7 +63,7 @@ static inline void blk_mq_sched_completed_request(struct request *rq, u64 now) > > > > static inline void blk_mq_sched_requeue_request(struct request *rq) > > > > { > > > > - if (rq->rq_flags & RQF_ELV) { > > > > + if ((rq->rq_flags & RQF_ELV) && !blk_mq_bypass_sched(rq->cmd_flags)) { > > > > struct request_queue *q = rq->q; > > > > struct elevator_queue *e = q->elevator; > > > > > > Has it been considered not to set RQF_ELV for passthrough requests instead > > > of making the above changes? > > > > That sounds like a good idea. It changes more behavior than what Ming is > > targeting here, but after looking through each use for RQF_ELV, I think > > not having that set really is the right thing to do in all cases for > > passthrough requests. > > I did consider that approach. But: > > - RQF_ELV actually means that the request & its tag is allocated from sched tags. > > - if RQF_ELV is cleared for passthrough request, request may be > allocated from sched tags(normal IO) and driver tags(passthrough) at the same time. > This way may cause other problem, such as, breaking blk_mq_hctx_has_requests(). > Meantime it becomes not likely to optimize tags resource utilization in future, > at least for single LUN/NS, no need to keep sched tags & driver tags > in memory at the same time. Isn't that similar to multiple namespaces where some use elevator and others use 'none'? They're all contenting for the same shared driver tags with racing 'has_requests()'. And the passthrough case is special with users of that interface taking on a greater responsibility and generally want the kernel out of the way. I don't think anyone would purposefully run a tag intense workload through that engine at the same time as using a generic one with the scheduler. It definitely should still work, but it doesn't need to be fair, right?