On Thu, Jul 16, 2020 at 11:17:55PM +1000, Aleksa Sarai wrote:
> On 2020-07-15, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> > In the basic case of "I want to run strace", this is really just a
> > creative use of ptrace in that interception is being used only for
> > reporting. Does ptrace need to grow a way to create/attach an io_uring
> > eventfd? Or should there be an entirely different tool for
> > administrative analysis of io_uring events (kind of how disk IO can be
> > monitored)?
>
> I would hope that we wouldn't introduce ptrace to io_uring, because
> unless we plan to attach to io_uring events via GDB it's simply the
> wrong tool for the job. strace does use ptrace, but that's mostly
> because Linux's dynamic tracing was still in its infancy at the time
> (and even today it requires more privileges than ptrace) -- but you can
> emulate strace using bpftrace these days fairly easily.
>
> So really what is being asked here is "can we make it possible to debug
> io_uring programs as easily as traditional I/O programs". And this does
> not require ptrace, nor should ptrace be part of this discussion IMHO. I
> believe this issue (along with seccomp-style filtering) have been
> mentioned informally in the past, but I am happy to finally see a thread
> about this appear.

Yeah, I don't see any sane way to attach ptrace, especially when what's
wanted is just "io_uring action logging", which is a much more narrow
issue, and one that doesn't map well to processes.

Can the io_uring eventfd be used for this kind of thing? It seems
io_uring just needs a way to gain an administrative path to opening it?

> > Solving the mapping of seccomp interception types into CQEs (or anything
> > more severe) will likely inform what it would mean to map ptrace events
> > to CQEs. So, I think they're related, and we should get seccomp hooked
> > up right away, and that might help us see how (if) ptrace should be
> > attached.
>
> We could just emulate the seccomp-bpf API with the pseudo-syscalls done
> as a result of CQEs, though I'm not sure how happy folks will be with
> this kind of glue code in "seccomp-uring" (though in theory it would
> allow us to attach existing filters to io_uring...).

Looking at the per-OP "syscall" implementations, I'm kind of alarmed
that some (e.g. openat2) are rather "open coded". It seems like this
should be fixed to have at least a common entry point for both io_uring
and proper syscalls.

-- 
Kees Cook