On 5/24/23 11:44 AM, Jeff Xu wrote:
> Hi Jens,
> Thanks for responding.
>
> On Wed, May 24, 2023 at 8:06 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> On 5/23/23 8:48 PM, Jeff Xu wrote:
>>> Hi
>>> I have a question on the protection key in io_uring. Today, when a
>>> user thread enters the kernel through syscall, PKRU is preserved, and
>>> the kernel will respect the PKEY protection of memory.
>>>
>>> For example:
>>> sys_mprotect_pkey((void *)ptr, size, PROT_READ | PROT_WRITE, pkey);
>>> pkey_write_deny(pkey); <-- disable write access to pkey for this thread.
>>> ret = read(fd, ptr, 1); <-- this will fail in the kernel.
>>>
>>> I wonder what is the case for io_uring, since read is now async, will
>>> kthread have the user thread's PKRU ?
>>
>> There is no kthread. What can happen is that some operation may be
>> punted to the io-wq workers, but these act exactly like a thread created
>> by the original task. IOW, if normal threads retain the protection key,
>> so will any io-wq io_uring thread. If they don't, they do not.
>>
> Does this also apply to when the IORING_SETUP_SQPOLL [1] flag is used
> ? it mentions a kernel thread is created to perform submission queue
> polling.

It doesn't matter if it's SQPOLL or one of the io-wq workers, they are
created in the same way. For all intents and purposes, they are
userspace threads, identical to one you'd get with pthread_create().
Only difference is that they never return to userspace.

>>> In theory, it is possible, i.e. from io_uring_enter syscall. But I
>>> don't know the implementation details of io_uring, hence asking the
>>> expert in this list.
>>
>> Right, if the IO is done inline, then it won't make a difference if eg
>> read(2) is used or IORING_OP_READ (or similar) with io_uring.
>>
> Can you please clarify what "IO is done inline" means ? i.e. are there
> cases that are not inline ?

I mean if the execution of it ends up being

app -> io_uring_enter() -> do io.

For some operations, you could end up with:

io_uring_enter() -> punt to io_wq
io_wq -> do io

either implicitly because the "do io" operation doesn't support
nonblocking issue (or ran out of resources), or explicitly if you set
IOSQE_ASYNC in the SQE you submitted.

-- 
Jens Axboe
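
For reference, the synchronous read(2) case from the original question
can be written out as a complete function. This is only a sketch under
stated assumptions, not code from the thread: it assumes x86 with
protection-key support and the glibc 2.27+ wrappers pkey_alloc(),
pkey_mprotect() and pkey_set(), which stand in for the
sys_mprotect_pkey() and pkey_write_deny() helpers quoted above. The
name pkey_read_demo() and the use of /dev/zero as the data source are
hypothetical choices made for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for pkey_alloc(), pkey_mprotect(), pkey_set() */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the thread.
 * Returns  0 if read(2) into a write-denied pkey region failed with EFAULT,
 *          1 if protection keys are unavailable and the check was skipped,
 *         -1 on any other outcome.
 */
int pkey_read_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t size;
    void *ptr;
    ssize_t ret;
    int pkey, fd, saved_errno;
    int result = -1;

    if (page <= 0)
        return -1;
    size = (size_t)page;

    /* Allocate a key with no restrictions; this fails without PKU support. */
    pkey = pkey_alloc(0, 0);
    if (pkey < 0)
        return 1;

    ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ptr == MAP_FAILED) {
        pkey_free(pkey);
        return -1;
    }

    fd = open("/dev/zero", O_RDONLY);
    if (fd < 0)
        goto out_unmap;

    /* Tag the page with the key, then deny writes for this thread only. */
    if (pkey_mprotect(ptr, size, PROT_READ | PROT_WRITE, pkey) != 0)
        goto out_close;
    if (pkey_set(pkey, PKEY_DISABLE_WRITE) != 0)
        goto out_close;

    /* The kernel must write into ptr on our behalf; PKRU forbids that. */
    errno = 0;
    ret = read(fd, ptr, 1);
    saved_errno = errno;
    if (ret == -1 && saved_errno == EFAULT)
        result = 0;

    /* Restore write access before tearing the mapping down. */
    pkey_set(pkey, 0);
out_close:
    close(fd);
out_unmap:
    munmap(ptr, size);
    pkey_free(pkey);
    return result;
}
```

Calling pkey_read_demo() from a small main() and checking for 0 gives a
baseline for the inline case. Whether the same buffer behaves the same
way when the request is punted to io-wq or issued from an SQPOLL thread
is the question under discussion here, and this sketch does not answer
it.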