Re: [RFC v2] io_uring: add set of tracing events

Dmitry Dolgov <9erthalion6@xxxxxxxxx> · Mon, 14 Oct 2019 11:43:25 +0200

> On Sun, Oct 13, 2019 at 11:45 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> > diff --git a/fs/io_uring.c b/fs/io_uring.c
> > index bfbb7ab3c4e..730f7182b2a 100644
> > --- a/fs/io_uring.c
> > +++ b/fs/io_uring.c
> > @@ -71,6 +71,9 @@
> >   #include <linux/sizes.h>
> >   #include <linux/hugetlb.h>
> >
> > +#define CREATE_TRACE_POINTS
> > +#include <trace/events/io_uring.h>
> > +
> >   #include <uapi/linux/io_uring.h>
> >
> >   #include "internal.h"
> > @@ -483,6 +486,7 @@ static inline void io_queue_async_work(struct io_ring_ctx *ctx,
> >               }
> >       }
> >
> > +     trace_io_uring_queue_async_work(ctx, rw, req, &req->work, req->flags);
> >       queue_work(ctx->sqo_wq[rw], &req->work);
> >   }
> >
> > @@ -707,6 +711,7 @@ static void io_fail_links(struct io_kiocb *req)
> >   {
> >       struct io_kiocb *link;
> >
> > +     trace_io_uring_fail_links(req);
> >       while (!list_empty(&req->link_list)) {
> >               link = list_first_entry(&req->link_list, struct io_kiocb, list);
> >               list_del(&link->list);
>
> Doesn't look like you have completion events, which makes it hard to
> tell which dependants got killed when failing the links. Maybe a good
> thing to add?

Do you mean an event like "this dependant got killed due to failing a link".
Yes, sounds useful.

> > @@ -1292,6 +1297,7 @@ static ssize_t io_import_iovec(struct io_ring_ctx *ctx, int rw,
> >                                               iovec, iter);
> >   #endif
> >
> > +     trace_io_uring_import_iovec(ctx, buf);
> >       return import_iovec(rw, buf, sqe_len, UIO_FASTIOV, iovec, iter);
> >   }
> >
>
> Not sure I see much merrit in this trace event.

Yep. The original idea was to somehow expose the information about "not using"
ctx->user_bufs, but after playing around with this events, I see that this one
called more often as I thought, and probably just confusing.

> > @@ -2021,6 +2027,7 @@ static int io_req_defer(struct io_ring_ctx *ctx, struct io_kiocb *req,
> >       req->submit.sqe = sqe_copy;
> >
> >       INIT_WORK(&req->work, io_sq_wq_submit_work);
> > +     trace_io_uring_defer(ctx, req, false);
> >       list_add_tail(&req->list, &ctx->defer_list);
> >       spin_unlock_irq(&ctx->completion_lock);
> >       return -EIOCBQUEUED;
> > @@ -2327,6 +2334,7 @@ static int io_req_set_file(struct io_ring_ctx *ctx, const struct sqe_submit *s,
> >       } else {
> >               if (s->needs_fixed_file)
> >                       return -EBADF;
> > +             trace_io_uring_file_get(ctx, fd);
> >               req->file = io_file_get(state, fd);
> >               if (unlikely(!req->file))
> >                       return -EBADF;
> > @@ -2357,6 +2365,8 @@ static int __io_queue_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
> >                               INIT_WORK(&req->work, io_sq_wq_submit_work);
> >                               io_queue_async_work(ctx, req);
> >                       }
> > +                     else
> > +                             trace_io_uring_add_to_prev(ctx, req);
> >
> >                       /*
> >                        * Queued up for async execution, worker will release
>
> Maybe put this one in io_add_to_prev_work()? Probably just using the
> 'ret' as part of the trace, to avoid a branch for this?

I thought about this, but then there will be no pointer to the context, and it
would be harder to figure out to what exactly this event corresponds to.

On the other hand we can track down the pointer to req, and then everything
should be clear. It involved a bit more efforts to analyze, but probably it
acceptable. So, I will indeed it move into io_add_to_prev_work.

> Failing that, the style is off a bit, should be:
>
>         } else {
>                 trace_io_uring_add_to_prev(ctx, req);
>         }
>
> > @@ -4194,6 +4210,9 @@ SYSCALL_DEFINE4(io_uring_register, unsigned int, fd, unsigned int, opcode,
> >       mutex_lock(&ctx->uring_lock);
> >       ret = __io_uring_register(ctx, opcode, arg, nr_args);
> >       mutex_unlock(&ctx->uring_lock);
> > +     if (ret >= 0)
> > +             trace_io_uring_register(ctx, opcode, ctx->nr_user_files,
> > +                                                             ctx->nr_user_bufs, ctx->cq_ev_fd != NULL);
>
> Just trace 'ret' as well?
> > + * io_uring_add_to_prev - called after a request was added into a previously
> > + *                                             submitted work
> > + *
> > + * @ctx:     pointer to a ring context structure
> > + * @req:     pointer to a request, added to a previous
> > + *
> > + * Allows to track merged work, to figure out how oftern requests are piggy
>
> often

Sure, will fix both typo and 'ret' tracing. Thanks for the comments!