Re: [PATCH] block: reduce kblockd_mod_delayed_work_on() CPU consumption

Ming Lei <ming.lei@xxxxxxxxxx> · Thu, 16 Dec 2021 15:22:09 +0800

On Wed, Dec 15, 2021 at 09:40:38AM -0800, Bart Van Assche wrote:
> On 12/14/21 7:59 AM, Jens Axboe wrote:
> > On 12/14/21 8:04 AM, Christoph Hellwig wrote:
> > > So why not do a non-delayed queue_work for that case?  Might be good
> > > to get the scsi and workqueue maintainers involved to understand the
> > > issue a bit better first.
> > 
> > We can probably get by with doing just that, and just ignore if a delayed
> > work timer is already running.
> > 
> > Dexuan, can you try this one?
> > 
> > diff --git a/block/blk-core.c b/block/blk-core.c
> > index 1378d084c770..c1833f95cb97 100644
> > --- a/block/blk-core.c
> > +++ b/block/blk-core.c
> > @@ -1484,6 +1484,8 @@ EXPORT_SYMBOL(kblockd_schedule_work);
> >   int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
> >   				unsigned long delay)
> >   {
> > +	if (!delay)
> > +		return queue_work_on(cpu, kblockd_workqueue, &dwork->work);
> >   	return mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
> >   }
> >   EXPORT_SYMBOL(kblockd_mod_delayed_work_on);
> 
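
For anyone following along, here is a rough user-space model of what the
patch above changes. It is only a sketch under stated assumptions: struct
toy_dwork and the toy_* helpers are made up for illustration and are not
kernel APIs. They only mimic the documented behaviour that queue_work_on()
does nothing when the work is already pending, while mod_delayed_work_on()
touches the timer on every call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct delayed_work; not the kernel type. */
struct toy_dwork {
	bool pending;           /* models the work's "pending" state */
	bool timer_armed;       /* a delayed timer is waiting to fire */
	unsigned long expires;  /* delay the timer was last armed with */
	unsigned int timer_ops; /* times the timer was cancelled/re-armed */
};

/* Models queue_work_on(): returns false and does nothing if pending. */
static bool toy_queue_work(struct toy_dwork *dw)
{
	if (dw->pending)
		return false;
	dw->pending = true;
	return true;
}

/* Models mod_delayed_work_on(): returns true if the work was pending. */
static bool toy_mod_delayed_work(struct toy_dwork *dw, unsigned long delay)
{
	bool was_pending = dw->pending;

	dw->timer_ops++;               /* the timer is touched on every call */
	dw->pending = true;
	dw->timer_armed = (delay != 0);
	dw->expires = delay;
	return was_pending;
}

/* Models the patched helper: zero delay skips the timer path entirely. */
static bool toy_kblockd_mod_delayed_work(struct toy_dwork *dw,
					 unsigned long delay)
{
	assert(dw);
	if (!delay)
		return toy_queue_work(dw);
	return toy_mod_delayed_work(dw, delay);
}
```

In this model, calling toy_kblockd_mod_delayed_work(&dw, 0) while a timer
is already armed returns false and leaves the timer alone, which is the
"just ignore" case Jens describes. On an idle work item the same call
queues it without any timer operation.
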
> As Christoph already mentioned, it would be great to receive feedback from the
> workqueue maintainer about this patch since I'm not aware of other kernel code
> that queues delayed_work in a similar way.
> Regarding the feedback from the view of the SCSI subsystem: I'd like to see the
> block layer core track whether or not a queue needs to be run such that the
> scsi_run_queue_async() call can be removed from scsi_end_request(). No such call

scsi_run_queue_async() only handles the restart after running out of
SCSI's device queue limit, so it shouldn't be a hot path now. It is
there to handle SCSI's own queue limit.
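
To illustrate the idea, the completion side only needs to kick the queue
when an earlier dispatch was rejected by the device limit. The following
is a simplified, single-threaded model and not the actual scsi_lib.c
code; struct toy_sdev and the toy_* functions are hypothetical names used
only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a device with its own queue limit. */
struct toy_sdev {
	int queue_depth; /* device queue limit */
	int busy;        /* commands currently in flight */
	int restarts;    /* dispatches rejected since the last re-run */
	int reruns;      /* how many times the queue was re-run */
};

/* Dispatch one command; on hitting the limit, note a restart is needed. */
static bool toy_dispatch(struct toy_sdev *sdev)
{
	if (sdev->busy >= sdev->queue_depth) {
		sdev->restarts++;
		return false;
	}
	sdev->busy++;
	return true;
}

/* Complete one command; re-run the queue only if a dispatch was rejected. */
static void toy_complete(struct toy_sdev *sdev)
{
	assert(sdev->busy > 0);
	sdev->busy--;
	if (sdev->restarts) {
		sdev->restarts = 0;
		sdev->reruns++; /* stands in for re-running the hw queues */
	}
}
```

As long as the device stays below its limit, toy_complete() never takes
the re-run branch, which is why this path should stay cold in practice.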

> was present in the original conversion of the SCSI core from the legacy block
> layer to blk-mq. See also commit d285203cf647 ("scsi: add support for a blk-mq
> based I/O path.").

That isn't true; see scsi_next_command()->scsi_run_queue().

Thanks,
Ming