On 7/21/23 9:48?AM, Darrick J. Wong wrote: > On Thu, Jul 20, 2023 at 12:13:08PM -0600, Jens Axboe wrote: >> Async dio completions generally happen from hard/soft IRQ context, which >> means that users like iomap may need to defer some of the completion >> handling to a workqueue. This is less efficient than having the original >> issuer handle it, like we do for sync IO, and it adds latency to the >> completions. >> >> Add IOCB_DIO_DEFER, which the issuer can set if it is able to safely >> punt these completions to a safe context. If the dio handler is aware >> of this flag, assign a callback handler in kiocb->dio_complete and >> associated data io kiocb->private. The issuer will then call this handler >> with that data from task context. >> >> No functional changes in this patch. >> >> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx> >> --- >> include/linux/fs.h | 34 ++++++++++++++++++++++++++++++++-- >> 1 file changed, 32 insertions(+), 2 deletions(-) >> >> diff --git a/include/linux/fs.h b/include/linux/fs.h >> index 6867512907d6..2c589418a078 100644 >> --- a/include/linux/fs.h >> +++ b/include/linux/fs.h >> @@ -338,6 +338,20 @@ enum rw_hint { >> #define IOCB_NOIO (1 << 20) >> /* can use bio alloc cache */ >> #define IOCB_ALLOC_CACHE (1 << 21) >> +/* >> + * IOCB_DIO_DEFER can be set by the iocb owner, to indicate that the >> + * iocb completion can be passed back to the owner for execution from a safe >> + * context rather than needing to be punted through a workqueue. If this >> + * flag is set, the completion handling may set iocb->dio_complete to a >> + * handler, which the issuer will then call from task context to complete >> + * the processing of the iocb. iocb->private should then also be set to >> + * the argument being passed to this handler. Note that while this provides > > Who should be setting iocb->private? Can I suggest rewording this to: > > "If this flag is set, the bio completion handling may set > iocb->dio_complete to a handler function and iocb->private to context > information for that handler. The issuer should call the handler with > that context information from task context to complete the processing of > the iocb." > > Assuming I've understood what this does from the next patch? :) Yep this is definitely better - thanks, I'll update it! >> + * a task context for the dio_complete() callback, it should only be used >> + * on the completion side for non-IO generating completions. It's fine to >> + * call blocking functions from this callback, but they should not wait for >> + * unrelated IO (like cache flushing, new IO generation, etc). >> + */ >> +#define IOCB_DIO_DEFER (1 << 22) > > Sorry to nitpick names here, but "defer" feels a little vague to me. > Defer what? And to whom? > > This flag means "defer iocb completion to the caller", right? If so, > wouldn't this be better named IOCB_DIO_CALLER_COMP? That is probably better indeed. Naming is hard! CALLER_COMP or ISSUER_COMP would be better and more descriptive. I'll go with your suggestion. >> /* for use in trace events */ >> #define TRACE_IOCB_STRINGS \ >> @@ -351,7 +365,8 @@ enum rw_hint { >> { IOCB_WRITE, "WRITE" }, \ >> { IOCB_WAITQ, "WAITQ" }, \ >> { IOCB_NOIO, "NOIO" }, \ >> - { IOCB_ALLOC_CACHE, "ALLOC_CACHE" } >> + { IOCB_ALLOC_CACHE, "ALLOC_CACHE" }, \ >> + { IOCB_DIO_DEFER, "DIO_DEFER" } >> >> struct kiocb { >> struct file *ki_filp; >> @@ -360,7 +375,22 @@ struct kiocb { >> void *private; >> int ki_flags; >> u16 ki_ioprio; /* See linux/ioprio.h */ >> - struct wait_page_queue *ki_waitq; /* for async buffered IO */ >> + union { >> + /* >> + * Only used for async buffered reads, where it denotes the >> + * page waitqueue associated with completing the read. Valid >> + * IFF IOCB_WAITQ is set. >> + */ >> + struct wait_page_queue *ki_waitq; >> + /* >> + * Can be used for O_DIRECT IO, where the completion handling >> + * is punted back to the issuer of the IO. May only be set >> + * if IOCB_DIO_DEFER is set by the issuer, and the issuer must >> + * then check for presence of this handler when ki_complete is >> + * invoked. > > Might want to reiterate in the comment that kiocb.private should be > passed as @data. OK, will do. -- Jens Axboe