On Tue, Nov 28, 2023 at 5:19 PM Jens Axboe <axboe@xxxxxxxxx> wrote: > On 11/28/23 8:58 AM, Jens Axboe wrote: > > On 11/27/23 2:53 PM, Jann Horn wrote: > >> Hi! > >> > >> I noticed something that I think does not currently cause any > >> significant security issues, but could be problematic in the future: > >> > >> io_uring sometimes processes task work in the middle of syscalls, > >> including between fdget() and fdput(). My understanding of task work > >> is that it is expected to run in a context similar to directly at > >> syscall entry/exit: task context, no locks held, sleeping is okay, and > >> it doesn't execute in the middle of some syscall that expects private > >> state of the task_struct to stay the same. > >> > >> An example of another user of task work is the keyring subsystem, > >> which does task_work_add() in keyctl_session_to_parent() to change the > >> cred pointers of another task. > >> > >> Several places in io_uring process task work while holding an fdget() > >> reference to some file descriptor. For example, the io_uring_enter > >> syscall handler calls io_iopoll_check() while the io_ring_ctx is only > >> referenced via fdget(). This means that if there were another kernel > >> subsystem that uses task work to close file descriptors, io_uring > >> would become unsafe. And io_uring does _almost_ that itself, I think: > >> io_queue_worker_create() can be run on a workqueue, and uses task work > >> to launch a worker thread from the context of a userspace thread; and > >> this worker thread can then accept commands to close file descriptors. > >> Except it doesn't accept commands to close io_uring file descriptors. > >> > >> A closer miss might be io_sync_cancel(), which holds a reference to > >> some normal file with fdget()/fdput() while calling into > >> io_run_task_work_sig(). However, from what I can tell, the only things > >> that are actually done with this file pointer are pointer comparisons, > >> so this also shouldn't have significant security impact. > >> > >> Would it make sense to use fget()/fput() instead of fdget()/fdput() in > >> io_sync_cancel(), io_uring_enter and io_uring_register? These > >> functions probably usually run in multithreaded environments anyway > >> (thanks to the io_uring worker threads), so I would think fdget() > >> shouldn't bring significant performance savings here? > > > > Let me run some testing on that. It's a mistake to think that it's > > usually multithreaded, generally if you end up using io-wq then it's not > > a fast path. A fast networked setup, for example, would never touch the > > threads and hence no threading would be implied by using io_uring. Ditto > > on the storage front, if you're just reading/writing or eg doing polled > > IO. That said, those workloads are generally threaded _anyway_ - not > > because of io_uring, but because that's how these kinds of workloads are > > written to begin with. Aah, because with polled I/O, when the fd is signalled as ready, the actual execution of work is done via task_work? Thanks for the explanation, I missed that. > > So probably won't be much of a concern to do the swap. The only > > "interesting" part of the above mix of cancel/register/enter is > > obviously the enter part. The rest are not really fast path. > > Did all three and ran the usual testing, which just so happens to be > multithreaded to begin with anyway. No discernable change from using > fget/fput over fdget/fdput. Oh, nice! > IOW, we may as well do this. Do you want to send a patch? Or I can send > out mine, up to you. Ah, if you already cooked up a patch to do the testing, I guess it's easier if you commit that?