On Sun, 2020-05-24 at 10:40 -0600, Jens Axboe wrote: > On 5/24/20 10:30 AM, Jens Axboe wrote: > > On 5/24/20 8:05 AM, Trond Myklebust wrote: > > > On Sat, 2020-05-23 at 12:57 -0600, Jens Axboe wrote: > > > > Use the async page locking infrastructure, if IOCB_WAITQ is set > > > > in > > > > the > > > > passed in iocb. The caller must expect an -EIOCBQUEUED return > > > > value, > > > > which means that IO is started but not done yet. This is > > > > similar to > > > > how > > > > O_DIRECT signals the same operation. Once the callback is > > > > received by > > > > the caller for IO completion, the caller must retry the > > > > operation. > > > > > > > > Signed-off-by: Jens Axboe <axboe@xxxxxxxxx> > > > > --- > > > > mm/filemap.c | 33 ++++++++++++++++++++++++++------- > > > > 1 file changed, 26 insertions(+), 7 deletions(-) > > > > > > > > diff --git a/mm/filemap.c b/mm/filemap.c > > > > index c746541b1d49..a3b86c9acdc8 100644 > > > > --- a/mm/filemap.c > > > > +++ b/mm/filemap.c > > > > @@ -1219,6 +1219,14 @@ static int > > > > __wait_on_page_locked_async(struct > > > > page *page, > > > > return ret; > > > > } > > > > > > > > +static int wait_on_page_locked_async(struct page *page, > > > > + struct wait_page_queue > > > > *wait) > > > > +{ > > > > + if (!PageLocked(page)) > > > > + return 0; > > > > + return __wait_on_page_locked_async(compound_head(page), > > > > wait, > > > > false); > > > > +} > > > > + > > > > /** > > > > * put_and_wait_on_page_locked - Drop a reference and wait for > > > > it to > > > > be unlocked > > > > * @page: The page to wait for. > > > > @@ -2058,17 +2066,25 @@ static ssize_t > > > > generic_file_buffered_read(struct kiocb *iocb, > > > > index, last_index - > > > > index); > > > > } > > > > if (!PageUptodate(page)) { > > > > - if (iocb->ki_flags & IOCB_NOWAIT) { > > > > - put_page(page); > > > > - goto would_block; > > > > - } > > > > - > > > > /* > > > > * See comment in do_read_cache_page on > > > > why > > > > * wait_on_page_locked is used to avoid > > > > unnecessarily > > > > * serialisations and why it's safe. > > > > */ > > > > - error = > > > > wait_on_page_locked_killable(page); > > > > + if (iocb->ki_flags & IOCB_WAITQ) { > > > > + if (written) { > > > > + put_page(page); > > > > + goto out; > > > > + } > > > > + error = > > > > wait_on_page_locked_async(page, > > > > + > > > > iocb- > > > > > private); > > > > > > If it is being used in 'generic_file_buffered_read()' as storage > > > for a > > > wait queue, then it is hard to consider this a 'private' field. > > > > private isn't the prettiest, and in fact this one in particular is > > a bit > > of a mess. It's not clear if it's caller or callee owned. It's > > generally > > not used, outside of the old usb gadget code, iomap O_DIRECT, and > > ocfs2. > > With FMODE_BUF_RASYNC, the fs obviously can't set it if it uses > > ->private for buffered IO. > > > > > Perhaps either rename and add type checking, or else add a > > > separate > > > field altogether to struct kiocb? > > > > I'd hate to add a new field and increase the size of the kiocb... > > One > > alternative is to do: > > > > union { > > void *private; > > struct wait_page_queue *ki_waitq; > > }; > > > > and still use IOCB_WAITQ to say that ->ki_waitq is valid. > > > > There's also 4 bytes of padding in the kiocb struct. And some > > fields are > > only used for O_DIRECT as well, eg ->ki_cookie which is just used > > for > > polled O_DIRECT. So we could also do: > > > > union { > > unsigned int ki_cookie; > > struct wait_page_queue *ki_waitq; > > }; > > > > and still not grow the kiocb. How about we go with this approach, > > and > > also add: > > > > if (kiocb->ki_flags & IOCB_HIPRI) > > return -EOPNOTSUPP; > > > > to kiocb_wait_page_queue_init() to make sure that this combination > > isn't > > valid? > > Here's the incremental, which is spread over 3 patches. I think this > one > makes sense, as polled IO doesn't support buffered IO. And because > doing > an async callback for completion is not how polled IO operates > anyway, > so even if buffered IO supported it, we'd not use the callback for > polled IO anyway. kiocb_wait_page_queue_init() checks and backs this > up. > > > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 0ef5f5973b1c..f7b1eb765c6e 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -317,7 +317,7 @@ enum rw_hint { > #define IOCB_SYNC (1 << 5) > #define IOCB_WRITE (1 << 6) > #define IOCB_NOWAIT (1 << 7) > -/* iocb->private holds wait_page_async struct */ > +/* iocb->ki_waitq is valid */ > #define IOCB_WAITQ (1 << 8) > > struct kiocb { > @@ -332,7 +332,10 @@ struct kiocb { > int ki_flags; > u16 ki_hint; > u16 ki_ioprio; /* See linux/ioprio.h */ > - unsigned int ki_cookie; /* for ->iopoll */ > + union { > + unsigned int ki_cookie; /* for ->iopoll */ > + struct wait_page_queue *ki_waitq; /* for async > buffered IO */ > + }; > > randomized_struct_fields_end > }; > diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h > index def58de92053..8b65420410ee 100644 > --- a/include/linux/pagemap.h > +++ b/include/linux/pagemap.h > @@ -498,13 +498,16 @@ static inline int > kiocb_wait_page_queue_init(struct kiocb *kiocb, > wait_queue_func_t func, > void *data) > { > + /* Can't support async wakeup with polled IO */ > + if (kiocb->ki_flags & IOCB_HIPRI) > + return -EINVAL; > if (kiocb->ki_filp->f_mode & FMODE_BUF_RASYNC) { > wait->wait.func = func; > wait->wait.private = data; > wait->wait.flags = 0; > INIT_LIST_HEAD(&wait->wait.entry); > kiocb->ki_flags |= IOCB_WAITQ; > - kiocb->private = wait; > + kiocb->ki_waitq = wait; > return 0; > } > > diff --git a/mm/filemap.c b/mm/filemap.c > index a3b86c9acdc8..18022de7dc33 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -2077,7 +2077,7 @@ static ssize_t > generic_file_buffered_read(struct kiocb *iocb, > goto out; > } > error = wait_on_page_locked_async(page, > - iocb- > >private); > + iocb- > >ki_waitq); > } else { > if (iocb->ki_flags & IOCB_NOWAIT) { > put_page(page); > @@ -2173,7 +2173,7 @@ static ssize_t > generic_file_buffered_read(struct kiocb *iocb, > page_not_up_to_date: > /* Get exclusive access to the page ... */ > if (iocb->ki_flags & IOCB_WAITQ) > - error = lock_page_async(page, iocb->private); > + error = lock_page_async(page, iocb->ki_waitq); > else > error = lock_page_killable(page); > if (unlikely(error)) Ack. That seems cleaner to me. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx