Stanislav Fomichev <sdf@xxxxxxxxxx> writes: > On 02/28, Toke H�iland-J�rgensen wrote: >> Stanislav Fomichev <sdf@xxxxxxxxxx> writes: > >> > On Mon, Feb 27, 2023 at 3:54 PM Toke H�iland-J�rgensen >> <toke@xxxxxxxxxx> wrote: >> >> >> >> Stanislav Fomichev <sdf@xxxxxxxxxx> writes: >> >> >> >> > On Mon, Feb 27, 2023 at 6:17 AM Toke H�iland-J�rgensen >> <toke@xxxxxxxxxx> wrote: >> >> >> >> >> >> Stanislav Fomichev <sdf@xxxxxxxxxx> writes: >> >> >> >> >> >> > On Thu, Feb 23, 2023 at 3:22 PM Toke H�iland-J�rgensen >> <toke@xxxxxxxxxx> wrote: >> >> >> >> >> >> >> >> Stanislav Fomichev <sdf@xxxxxxxxxx> writes: >> >> >> >> >> >> >> >> > I'd like to discuss a potential follow up for the >> previous "XDP RX >> >> >> >> > metadata" series [0]. >> >> >> >> > >> >> >> >> > Now that we can access (a subset of) packet metadata at RX, >> I'd like to >> >> >> >> > explore the options where we can export some of that metadata >> on TX. And >> >> >> >> > also whether it might be possible to access some of the TX >> completion >> >> >> >> > metadata (things like TX timestamp). >> >> >> >> > >> >> >> >> > I'm currently trying to understand whether the same approach >> I've used >> >> >> >> > on RX could work at TX. By May I plan to have a bunch of >> options laid >> >> >> >> > out (currently considering XSK tx/compl programs and XDP >> tx/compl >> >> >> >> > programs) so we have something to discuss. >> >> >> >> >> >> >> >> I've been looking at ways of getting a TX-completion hook for >> the XDP >> >> >> >> queueing stuff as well. For that, I think it could work to just >> hook >> >> >> >> into xdp_return_frame(), but if you want to access hardware >> metadata >> >> >> >> it'll obviously have to be in the driver. A hook in the driver >> could >> >> >> >> certainly be used for the queueing return as well, though, which >> may >> >> >> >> help making it worth the trouble :) >> >> >> > >> >> >> > Yeah, I'd like to get to completion descriptors ideally; so >> nothing >> >> >> > better than a driver hook comes to mind so far :-( >> >> >> > (I'm eye-balling mlx5's mlx5e_free_xdpsq_desc AF_XDP path mostly >> so far). >> >> >> >> >> >> Is there any other use case for this than getting the TX timestamp? >> Not >> >> >> really sure what else those descriptors contain... >> >> > >> >> > I don't think so; at least looking at mlx5 and bnxt (the latter >> >> > doesn't have a timestamp in the completion ring). >> >> > So yeah, not sure, maybe that should be on the side and be AF_XDP >> specific. >> >> > And not even involve bpf, just put the tx tstamp somewhere in umem: >> >> > setsockopt(xsk_fd, SOL_XDP, XSK_STAMP_TX_COMPLETION, >> >> > &data_relative_offset, ..); >> >> > OTOH, if it is only a timestamp now, it doesn't mean that's all we'd >> >> > have for eternity? (plus, this needs a driver "hook" for af_xdp >> >> > anyway, so why not make it generic?) >> >> >> >> So since this is read-only in any case, we could just make it a >> >> tracepoint instead of a whole new hook? That's what I was planning to >> do >> >> for xdp_return_frame(); we just need a way to refer back to the >> original >> >> frame, but we'd need to solve that anyway. Just letting XDP/AF_XDP >> >> specify their own packet ID in some form, and make that part of the >> >> tracepoint data, would probably be sufficient? >> > >> > That would probably mean a driver specific tracepoint (since we might >> > need to get to the completion queue descriptor)? >> > Idk, probably still makes sense to have something that works across >> > different drivers? >> > Or are you suggesting to just do fentry/mlx5e_free_xdpsq_desc and go >> from there? >> > Not sure I can get a umem frame, as you've mentioned; and also it >> > doesn't look like cqe is there... > >> No, we'd define the tracepoint in one place (like along with the others >> in include/xdp/trace/event.h), and have it include the fields we need >> (ifindex, queue index, timestamp, some kind of packet ID). And then add >> a call in each driver that will support this, wherever it makes sense in >> that driver. This is what we do with the trace_xdp_exception() calls >> already scattered through drivers in the RX handling code. > >> With this, a BPF listener can attach to just the one tracepoint, and get >> events from all drivers that support it (but it can filter on ifindex >> for the events it's interested in). > >> This probably doesn't scale indefinitely: it's possible to add new >> fields to the tracepoint if we discover other uses than the timestamp, >> but this would probably get unwieldy at some point. However, it's a >> lightweight way to add a "hook" if we don't have any other use cases >> than getting the timestamp for now. > >> > I guess the fact that it would arrive out-of-band (not in a umem >> > frame) is a minor inconvenience, the userspace should be able to join >> > the data together hopefully. > >> Yeah, that's why I suggested the user-defined packet ID: if the event >> just includes that, the BPF or userspace program can do its own matching >> depending on its needs... > > I guess we can implement this user-defined packet ID as a TX metadata > as well? Some kind of u32 tag/mark that the XDP program can set? (and > some new AF_XDP TX hook where we can set it up as well) Yeah, good point, that would be an excellent way to resolve that particular issue :) -Toke