> On Sep 22, 2016, at 9:44 PM, Ming Lei <ming.lei@xxxxxxxxxxxxx> wrote:
>
>> On Thu, Sep 22, 2016 at 11:17 PM, Jens Axboe <axboe@xxxxxx> wrote:
>>> On 09/22/2016 09:12 AM, Christoph Hellwig wrote:
>>>
>>>> On Thu, Sep 22, 2016 at 09:03:56AM -0600, Jens Axboe wrote:
>>>>
>>>> Having to grab a mutex, for instance. We invoke ->queue_rq() with
>>>> preemption disabled, so I'd hope that would not be the case... What
>>>> drivers block in ->queue_rq?
>>>
>>> I thought I had converted a lot of them to GFP_NOIO instead of GFP_ATOMIC
>>> allocations, but I can't find any evidence of that. Maybe it was just
>>> my imagination, or an unsubmitted patch series. Sorry for the confusion.
>>
>> OK, that makes more sense. Pretty sure we would have had complaints!
>>
>>>> Loop was another case that was on my radar to get rid of the queue_work
>>>> it currently has to do. Josef is currently testing the nbd driver using
>>>> this approach, so we should get some numbers there too. If we have to,
>>>> we can always bump up the concurrency to mimic more of the behavior of
>>>> having multiple workers running on the hardware queue. I'd prefer to
>>>> still do that in blk-mq, instead of having drivers reinventing their own
>>>> work offload functionality.
>>>
>>> There should be a lot of numbers in the list archives from the time
>>> that Ming did the loop conversion, as I've been trying to steer him
>>> that way, and he actually implemented and benchmarked it.
>>>
>>> We can't just increase the concurrency, as a single work_struct item
>>> can't be queued multiple times even on a high-concurrency workqueue.
>>
>> But we could have more work items, if we had to. Even if loop isn't a
>> drop-in replacement for this simpler approach, I think it'll work well
>> enough for nbd. The 5% number from Josef is comparing to not having any
>> offload at all; I suspect the number from just converting to queue_work
>> in the driver would be similar.
>
> 5% might just be noise.
>
> From nbd's .queue_rq(), kthread_work should match its case, because a
> per-nbd mutex is required and all commands sent to the nbd are
> serialized, so using common work items may hurt performance through
> unnecessary context switches.

This is with my multi-connection patch, with 4 connections per device
(so 4 hw queues). The 5% is real, but I don't think it'll get better by
punting to a worker per command, so this approach is fine for NBD.

Thanks,

Josef
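
A minimal sketch of the work_struct limitation Christoph points out above;
the my_* names are hypothetical and only the stock workqueue interface is
real, where queue_work() silently does nothing if the item is already pending:

#include <linux/workqueue.h>

struct my_hctx_data {
	struct workqueue_struct	*wq;	/* shared submission workqueue */
	struct work_struct	work;	/* one item per hardware queue */
};

static void my_issue_work(struct work_struct *work)
{
	/* drain the driver's internal command list and submit it */
}

static int my_hctx_init(struct my_hctx_data *d)
{
	/* a large max_active does not add concurrency for a single item */
	d->wq = alloc_workqueue("my_wq", WQ_HIGHPRI | WQ_MEM_RECLAIM, 16);
	if (!d->wq)
		return -ENOMEM;
	INIT_WORK(&d->work, my_issue_work);
	return 0;
}

static void my_kick(struct my_hctx_data *d)
{
	/*
	 * queue_work() returns false and is a no-op if d->work is already
	 * pending, so this one item can only ever run on one worker at a
	 * time; more concurrency requires more work items.
	 */
	queue_work(d->wq, &d->work);
}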
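
And a rough sketch of the per-device kthread_work approach Ming suggests for
nbd, using the kthread_worker helpers (kthread_create_worker() /
kthread_queue_work()); the nbd_like_* names are invented for illustration,
and the int / BLK_MQ_RQ_QUEUE_OK return follows the ->queue_rq() convention
of that time rather than the actual nbd patches:

#include <linux/kthread.h>
#include <linux/blk-mq.h>
#include <linux/mutex.h>

struct nbd_like_dev {
	/* created once at setup, e.g. kthread_create_worker(0, "nbd%d-send", i) */
	struct kthread_worker	*worker;
	struct mutex		tx_lock;	/* per-device send mutex */
};

struct nbd_like_cmd {
	struct kthread_work	work;
	struct request		*rq;
	struct nbd_like_dev	*dev;
};

static void nbd_like_send(struct kthread_work *work)
{
	struct nbd_like_cmd *cmd = container_of(work, struct nbd_like_cmd, work);

	mutex_lock(&cmd->dev->tx_lock);
	/* ... write the request header and data to the socket ... */
	mutex_unlock(&cmd->dev->tx_lock);
}

static int nbd_like_queue_rq(struct blk_mq_hw_ctx *hctx,
			     const struct blk_mq_queue_data *bd)
{
	struct nbd_like_cmd *cmd = blk_mq_rq_to_pdu(bd->rq);
	struct nbd_like_dev *dev = hctx->queue->queuedata;

	cmd->rq = bd->rq;
	cmd->dev = dev;
	blk_mq_start_request(bd->rq);

	/* hand the command to the device's own thread instead of blocking here */
	kthread_init_work(&cmd->work, nbd_like_send);
	kthread_queue_work(dev->worker, &cmd->work);
	return BLK_MQ_RQ_QUEUE_OK;
}

Since tx_lock serializes every command anyway, a single dedicated thread per
device avoids waking an arbitrary pool worker for each request, which is the
context-switch cost Ming is concerned about.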