On 10/13/19 9:42 AM, Dmitrii Dolgov wrote: > To trace io_uring activity one can get an information from workqueue and > io trace events, but looks like some parts could be hard to identify via > this approach. Making what happens inside io_uring more transparent is > important to be able to reason about many aspects of it, hence introduce > the set of tracing events. > > All such events could be roughly divided into two categories: > > * those, that are helping to understand correctness (from both kernel > and an application point of view). E.g. a ring creation, file > registration, or waiting for available CQE. Proposed approach is to > get a pointer to an original structure of interest (ring context, or > request), and then find relevant events. io_uring_queue_async_work > also exposes a pointer to work_struct, to be able to track down > corresponding workqueue events. > > * those, that provide performance related information. Mostly it's about > events that change the flow of requests, e.g. whether an async work > was queued, or delayed due to some dependencies. Another important > case is how io_uring optimizations (e.g. registered files) are > utilized. I like this in general, a few questions below. > diff --git a/fs/io_uring.c b/fs/io_uring.c > index bfbb7ab3c4e..730f7182b2a 100644 > --- a/fs/io_uring.c > +++ b/fs/io_uring.c > @@ -71,6 +71,9 @@ > #include <linux/sizes.h> > #include <linux/hugetlb.h> > > +#define CREATE_TRACE_POINTS > +#include <trace/events/io_uring.h> > + > #include <uapi/linux/io_uring.h> > > #include "internal.h" > @@ -483,6 +486,7 @@ static inline void io_queue_async_work(struct io_ring_ctx *ctx, > } > } > > + trace_io_uring_queue_async_work(ctx, rw, req, &req->work, req->flags); > queue_work(ctx->sqo_wq[rw], &req->work); > } > > @@ -707,6 +711,7 @@ static void io_fail_links(struct io_kiocb *req) > { > struct io_kiocb *link; > > + trace_io_uring_fail_links(req); > while (!list_empty(&req->link_list)) { > link = list_first_entry(&req->link_list, struct io_kiocb, list); > list_del(&link->list); Doesn't look like you have completion events, which makes it hard to tell which dependants got killed when failing the links. Maybe a good thing to add? > @@ -1292,6 +1297,7 @@ static ssize_t io_import_iovec(struct io_ring_ctx *ctx, int rw, > iovec, iter); > #endif > > + trace_io_uring_import_iovec(ctx, buf); > return import_iovec(rw, buf, sqe_len, UIO_FASTIOV, iovec, iter); > } > Not sure I see much merrit in this trace event. > @@ -2021,6 +2027,7 @@ static int io_req_defer(struct io_ring_ctx *ctx, struct io_kiocb *req, > req->submit.sqe = sqe_copy; > > INIT_WORK(&req->work, io_sq_wq_submit_work); > + trace_io_uring_defer(ctx, req, false); > list_add_tail(&req->list, &ctx->defer_list); > spin_unlock_irq(&ctx->completion_lock); > return -EIOCBQUEUED; > @@ -2327,6 +2334,7 @@ static int io_req_set_file(struct io_ring_ctx *ctx, const struct sqe_submit *s, > } else { > if (s->needs_fixed_file) > return -EBADF; > + trace_io_uring_file_get(ctx, fd); > req->file = io_file_get(state, fd); > if (unlikely(!req->file)) > return -EBADF; > @@ -2357,6 +2365,8 @@ static int __io_queue_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req, > INIT_WORK(&req->work, io_sq_wq_submit_work); > io_queue_async_work(ctx, req); > } > + else > + trace_io_uring_add_to_prev(ctx, req); > > /* > * Queued up for async execution, worker will release Maybe put this one in io_add_to_prev_work()? Probably just using the 'ret' as part of the trace, to avoid a branch for this? Failing that, the style is off a bit, should be: } else { trace_io_uring_add_to_prev(ctx, req); } > @@ -4194,6 +4210,9 @@ SYSCALL_DEFINE4(io_uring_register, unsigned int, fd, unsigned int, opcode, > mutex_lock(&ctx->uring_lock); > ret = __io_uring_register(ctx, opcode, arg, nr_args); > mutex_unlock(&ctx->uring_lock); > + if (ret >= 0) > + trace_io_uring_register(ctx, opcode, ctx->nr_user_files, > + ctx->nr_user_bufs, ctx->cq_ev_fd != NULL); Just trace 'ret' as well? > + * io_uring_add_to_prev - called after a request was added into a previously > + * submitted work > + * > + * @ctx: pointer to a ring context structure > + * @req: pointer to a request, added to a previous > + * > + * Allows to track merged work, to figure out how oftern requests are piggy often -- Jens Axboe