On 2/12/24 11:28, Bart Van Assche wrote:
On 2/9/24 10:12, Jens Axboe wrote:
Greg, can you elaborate on how useful cancel is for gadgets? Is it one
of those things that was wired up "just because", or does it have
actually useful cases?
I found two use cases in the Android Open Source Project and have submitted
CLs that request to remove the io_cancel() calls from that code. Although I
think I understand why these calls were added, the race conditions that
these io_cancel() calls try to address cannot be addressed completely by
calling io_cancel().
(replying to my own e-mail) The adb daemon (adbd) maintainers asked me to
preserve the I/O cancellation code in adbd because it was introduced recently
in that code to fix an important bug. Does everyone agree with the approach of
the untested patches below?
Thanks,
Bart.
-----------------------------------------------------------------------------
[PATCH 1/2] fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted
by libaio
If kiocb_set_cancel_fn() is called for I/O submitted by io_uring, the
following kernel warning appears:
WARNING: CPU: 3 PID: 368 at fs/aio.c:598 kiocb_set_cancel_fn+0x9c/0xa8
Call trace:
kiocb_set_cancel_fn+0x9c/0xa8
ffs_epfile_read_iter+0x144/0x1d0
io_read+0x19c/0x498
io_issue_sqe+0x118/0x27c
io_submit_sqes+0x25c/0x5fc
__arm64_sys_io_uring_enter+0x104/0xab0
invoke_syscall+0x58/0x11c
el0_svc_common+0xb4/0xf4
do_el0_svc+0x2c/0xb0
el0_svc+0x2c/0xa4
el0t_64_sync_handler+0x68/0xb4
el0t_64_sync+0x1a4/0x1a8
Fix this by setting the IOCB_AIO_RW flag for read and write I/O that is
submitted by libaio.
Suggested-by: Jens Axboe <axboe@xxxxxxxxx>
---
fs/aio.c | 9 ++++++++-
include/linux/fs.h | 2 ++
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/aio.c b/fs/aio.c
index bb2ff48991f3..da18dbcfcb22 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -593,6 +593,13 @@ void kiocb_set_cancel_fn(struct kiocb *iocb, kiocb_cancel_fn *cancel)
struct kioctx *ctx = req->ki_ctx;
unsigned long flags;
+ /*
+ * kiocb didn't come from aio or is neither a read nor a write, hence
+ * ignore it.
+ */
+ if (!(iocb->ki_flags & IOCB_AIO_RW))
+ return;
+
if (WARN_ON_ONCE(!list_empty(&req->ki_list)))
return;
@@ -1509,7 +1516,7 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
req->ki_complete = aio_complete_rw;
req->private = NULL;
req->ki_pos = iocb->aio_offset;
- req->ki_flags = req->ki_filp->f_iocb_flags;
+ req->ki_flags = req->ki_filp->f_iocb_flags | IOCB_AIO_RW;
if (iocb->aio_flags & IOCB_FLAG_RESFD)
req->ki_flags |= IOCB_EVENTFD;
if (iocb->aio_flags & IOCB_FLAG_IOPRIO) {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ed5966a70495..c2dcc98cb4c8 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -352,6 +352,8 @@ enum rw_hint {
* unrelated IO (like cache flushing, new IO generation, etc).
*/
#define IOCB_DIO_CALLER_COMP (1 << 22)
+/* kiocb is a read or write operation submitted by fs/aio.c. */
+#define IOCB_AIO_RW (1 << 23)
/* for use in trace events */
#define TRACE_IOCB_STRINGS \
-----------------------------------------------------------------------------
[PATCH 2/2] fs/aio: Make io_cancel() generate completions again
The following patch accidentally removed the code for delivering
completions for cancelled reads and writes to user space: "[PATCH 04/33]
aio: remove retry-based AIO"
(https://lore.kernel.org/all/1363883754-27966-5-git-send-email-koverstreet@xxxxxxxxxx/)
From that patch:
- if (kiocbIsCancelled(iocb)) {
- ret = -EINTR;
- aio_complete(iocb, ret, 0);
- /* must not access the iocb after this */
- goto out;
- }
This leads to a leak in user space of a struct iocb. Hence this patch
that restores the code that reports to user space that a read or write
has been cancelled successfully.
---
fs/aio.c | 27 +++++++++++----------------
1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index da18dbcfcb22..28223f511931 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -2165,14 +2165,11 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
#endif
/* sys_io_cancel:
- * Attempts to cancel an iocb previously passed to io_submit. If
- * the operation is successfully cancelled, the resulting event is
- * copied into the memory pointed to by result without being placed
- * into the completion queue and 0 is returned. May fail with
- * -EFAULT if any of the data structures pointed to are invalid.
- * May fail with -EINVAL if aio_context specified by ctx_id is
- * invalid. May fail with -EAGAIN if the iocb specified was not
- * cancelled. Will fail with -ENOSYS if not implemented.
+ * Attempts to cancel an iocb previously passed to io_submit(). If the
+ * operation is successfully cancelled 0 is returned. May fail with
+ * -EFAULT if any of the data structures pointed to are invalid. May
+ * fail with -EINVAL if aio_context specified by ctx_id is invalid. Will
+ * fail with -ENOSYS if not implemented.
*/
SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
struct io_event __user *, result)
@@ -2203,14 +2200,12 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
}
spin_unlock_irq(&ctx->ctx_lock);
- if (!ret) {
- /*
- * The result argument is no longer used - the io_event is
- * always delivered via the ring buffer. -EINPROGRESS indicates
- * cancellation is progress:
- */
- ret = -EINPROGRESS;
- }
+ /*
+ * The result argument is no longer used - the io_event is always
+ * delivered via the ring buffer.
+ */
+ if (ret == 0 && kiocb->rw.ki_flags & IOCB_AIO_RW)
+ aio_complete_rw(&kiocb->rw, -EINTR);
percpu_ref_put(&ctx->users);
-----------------------------------------------------------------------------