On 11/7/20 3:49 PM, Pavel Begunkov wrote:
> On 07/11/2020 22:30, Jens Axboe wrote:
>> On 11/7/20 2:16 PM, Pavel Begunkov wrote:
>>> SQPOLL task may find sqo_task->files == NULL, so
>>> __io_sq_thread_acquire_files() would left it unset and so all the
>>> following fails, e.g. attempts to submit. Fail if sqo_task doesn't have
>>> files.
>>>
>>> [  118.962785] BUG: kernel NULL pointer dereference, address:
>>> 0000000000000020
>>> [  118.963812] #PF: supervisor read access in kernel mode
>>> [  118.964534] #PF: error_code(0x0000) - not-present page
>>> [  118.969029] RIP: 0010:__fget_files+0xb/0x80
>>> [  119.005409] Call Trace:
>>> [  119.005651]  fget_many+0x2b/0x30
>>> [  119.005964]  io_file_get+0xcf/0x180
>>> [  119.006315]  io_submit_sqes+0x3a4/0x950
>>> [  119.006678]  ? io_double_put_req+0x43/0x70
>>> [  119.007054]  ? io_async_task_func+0xc2/0x180
>>> [  119.007481]  io_sq_thread+0x1de/0x6a0
>>> [  119.007828]  kthread+0x114/0x150
>>> [  119.008135]  ? __ia32_sys_io_uring_enter+0x3c0/0x3c0
>>> [  119.008623]  ? kthread_park+0x90/0x90
>>> [  119.008963]  ret_from_fork+0x22/0x30
>>>
>>> Reported-by: Josef Grieb <josef.grieb@xxxxxxxxx>
>>> Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
>>> ---
>>>  fs/io_uring.c | 19 ++++++++++++-------
>>>  1 file changed, 12 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>> index 8d721a652d61..9c035c5c4080 100644
>>> --- a/fs/io_uring.c
>>> +++ b/fs/io_uring.c
>>> @@ -1080,7 +1080,7 @@ static void io_sq_thread_drop_mm_files(void)
>>>  	}
>>>  }
>>>
>>> -static void __io_sq_thread_acquire_files(struct io_ring_ctx *ctx)
>>> +static int __io_sq_thread_acquire_files(struct io_ring_ctx *ctx)
>>>  {
>>>  	if (!current->files) {
>>>  		struct files_struct *files;
>>> @@ -1091,7 +1091,7 @@ static void __io_sq_thread_acquire_files(struct io_ring_ctx *ctx)
>>>  		files = ctx->sqo_task->files;
>>>  		if (!files) {
>>>  			task_unlock(ctx->sqo_task);
>>> -			return;
>>> +			return -EFAULT;
>>
>> I don't think we should use -EFAULT here, it's generally used for trying
>> to copy in/out of invalid regions. Probably -ECANCELED is better here,
>
> Noted, I'll resend after Josef tests this.
>
>> in lieu of something super appropriate. Maybe -EBADF would be fine too.
>
> Yeah, something along OWNER_TASK_DEAD would make more sense.

You could try and commandeer -EOWNERDEAD for this use case, it does make
sense.

-- 
Jens Axboe
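
For readers following the errno discussion, here is a rough userspace sketch of the shape being proposed: the acquire helper returns an int, and the submit path bails out when the owner task has no files. This is not the kernel code. The struct layouts, the `current_files` variable, the plain refcount bump, and the `acquire_files()`/`submit()` names are all hypothetical stand-ins, and -EOWNERDEAD is only the value suggested in this thread, not necessarily what was merged.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real definitions. */
struct files_struct { int refcount; };
struct task { struct files_struct *files; };
struct ring_ctx { struct task *sqo_task; };

/* Files table currently held by the mock SQPOLL thread (stands in for current->files). */
static struct files_struct *current_files;

/*
 * Same shape as the patched helper: 0 on success, negative errno when the
 * owner task no longer has a files table to borrow.
 */
static int acquire_files(struct ring_ctx *ctx)
{
	assert(ctx != NULL && ctx->sqo_task != NULL);

	if (!current_files) {
		struct files_struct *files = ctx->sqo_task->files;

		if (!files)
			return -EOWNERDEAD;	/* assumption: errno suggested in the thread */
		files->refcount++;		/* simplified reference grab */
		current_files = files;
	}
	return 0;
}

/* The caller propagates the failure instead of submitting without files. */
static int submit(struct ring_ctx *ctx)
{
	int ret = acquire_files(ctx);

	if (ret)
		return ret;
	/* ... submission would look up fds through current_files here ... */
	return 0;
}
```

The point of the int return is that the submit path can stop before it reaches anything like fget_many() with a NULL files pointer, which is where the oops above was triggered.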