Re: strace of io_uring events?

Kees Cook <keescook@xxxxxxxxxxxx> · Thu, 16 Jul 2020 08:19:34 -0700

On Thu, Jul 16, 2020 at 11:17:55PM +1000, Aleksa Sarai wrote:
> On 2020-07-15, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> > In the basic case of "I want to run strace", this is really just a
> > creative use of ptrace in that interception is being used only for
> > reporting. Does ptrace need to grow a way to create/attach an io_uring
> > eventfd? Or should there be an entirely different tool for
> > administrative analysis of io_uring events (kind of how disk IO can be
> > monitored)?
> 
> I would hope that we wouldn't introduce ptrace to io_uring, because
> unless we plan to attach to io_uring events via GDB it's simply the
> wrong tool for the job. strace does use ptrace, but that's mostly
> because Linux's dynamic tracing was still in its infancy at the time
> (and even today it requires more privileges than ptrace) -- but you can
> emulate strace using bpftrace these days fairly easily.
> 
> So really what is being asked here is "can we make it possible to debug
> io_uring programs as easily as traditional I/O programs". And this does
> not require ptrace, nor should ptrace be part of this discussion IMHO. I
> believe this issue (along with seccomp-style filtering) have been
> mentioned informally in the past, but I am happy to finally see a thread
> about this appear.

Yeah, I don't see any sane way to attach ptrace, especially when what's
wanted is just "io_uring action logging", which is a much more narrow
issue, and one that doesn't map well to processes.

Can the io_uring eventfd be used for this kind of thing? It seems
io_uring just needs a way to gain an administrative path to opening it?

> > Solving the mapping of seccomp interception types into CQEs (or anything
> > more severe) will likely inform what it would mean to map ptrace events
> > to CQEs. So, I think they're related, and we should get seccomp hooked
> > up right away, and that might help us see how (if) ptrace should be
> > attached.
> 
> We could just emulate the seccomp-bpf API with the pseudo-syscalls done
> as a result of CQEs, though I'm not sure how happy folks will be with
> this kind of glue code in "seccomp-uring" (though in theory it would
> allow us to attach existing filters to io_uring...).

Looking at the per-OP "syscall" implementations, I'm kind of alarmed
that some (e.g. openat2) are rather "open coded". It seems like this
should be fixed to have at least a common entry point for both io_uring
and proper syscalls.

-- 
Kees Cook