Re: [LSF/MM/BPF TOPIC] XDP metadata for TX

Stanislav Fomichev <sdf@xxxxxxxxxx> writes:

> On Thu, Feb 23, 2023 at 3:22 PM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>>
>> Stanislav Fomichev <sdf@xxxxxxxxxx> writes:
>>
>> > I'd like to discuss a potential follow up for the previous "XDP RX
>> > metadata" series [0].
>> >
>> > Now that we can access (a subset of) packet metadata at RX, I'd like to
>> > explore the options where we can export some of that metadata on TX. And
>> > also whether it might be possible to access some of the TX completion
>> > metadata (things like TX timestamp).
>> >
>> > I'm currently trying to understand whether the same approach I've used
>> > on RX could work at TX. By May I plan to have a bunch of options laid
>> > out (currently considering XSK tx/compl programs and XDP tx/compl
>> > programs) so we have something to discuss.
>>
>> I've been looking at ways of getting a TX-completion hook for the XDP
>> queueing stuff as well. For that, I think it could work to just hook
>> into xdp_return_frame(), but if you want to access hardware metadata
>> it'll obviously have to be in the driver. A hook in the driver could
>> certainly be used for the queueing return as well, though, which may
>> help make it worth the trouble :)
>
> Yeah, I'd like to get to completion descriptors ideally; so nothing
> better than a driver hook comes to mind so far :-(
> (I'm eye-balling mlx5's mlx5e_free_xdpsq_desc AF_XDP path mostly so far).

Is there any other use case for this than getting the TX timestamp? Not
really sure what else those descriptors contain...

>> > I'd like to get some more input on whether applying the same idea on TX
>> > makes sense or not and whether there are any sensible alternatives.
>> > (IIRC, there was an attempt to do XDP on egress that went nowhere).
>>
>> I believe that stranded because it was deemed not feasible to cover the
>> SKB TX path as well, which means it can't be symmetrical to the RX hook.
>> So we ended up with the in-devmap hook instead. I'm not sure if that's
>> made easier by multi-buf XDP, so that may be worth revisiting.
>>
>> For the TX metadata you don't really have to care about the skb path, I
>> suppose, so that may not matter too much either. However, at least for
>> the in-kernel xdp_frame the TX path is pushed from the stack anyway, so
>> I'm not sure if it's worth having a separate hook in the driver (with
>> all the added complexity and overhead that entails) just to set
>> metadata? That could just as well be done on push from higher up the
>> stack; per-driver kfuncs could still be useful for this, though.
>>
>> And of course something would be needed so that BPF programs can
>> process AF_XDP frames in the kernel before they hit the driver, but
>> again I'm not sure that needs to be a hook in the driver.
>
> Care to elaborate more on "push from higher up the stack"?

I'm referring to the XDP_REDIRECT path here: xdp_frames are transmitted
by the stack calling ndo_xdp_xmit() in the driver with an array of
frames that are immediately put on the wire (see bq_xmit_all() in
devmap.c). So any metadata writing could be done at that point, since
the target driver is already known; there's even already a program hook
in there (used for in-devmap programs).
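
To make the shape of that concrete, here is a rough user-space model of the
flush step: run an optional per-frame hook that can write metadata or drop
the frame, then hand the surviving frames to the driver in one call. This is
only a sketch and not kernel code. Every model_* name, the tx_meta field and
the "launch time" value are made up for illustration; only the overall flow
(hook first, then a single ndo_xdp_xmit()-style call with an array of
frames) is meant to mirror what is described above.

```c
/* User-space model only: none of these types or functions are kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_BULK_MAX 16

struct model_frame {
    uint32_t len;         /* payload length */
    uint64_t tx_meta;     /* hypothetical TX metadata, e.g. a launch time */
    int      has_tx_meta; /* set once the hook has written tx_meta */
};

struct model_dev;

/* Stands in for the driver's ndo_xdp_xmit(): takes an array of frames. */
typedef int (*model_xmit_fn)(struct model_dev *dev, int n,
                             struct model_frame **frames);
/* Per-frame hook: returns nonzero to keep the frame, zero to drop it. */
typedef int (*model_hook_fn)(const struct model_dev *dev,
                             struct model_frame *f);

struct model_dev {
    int ifindex;
    model_xmit_fn xmit;
    int sent;            /* frames the model driver has "put on the wire" */
    int sent_with_meta;  /* ... of which carried TX metadata */
};

struct model_bulk_queue {
    struct model_dev *dev;   /* target device is known at flush time */
    model_hook_fn hook;      /* optional, like an in-devmap program */
    struct model_frame *q[MODEL_BULK_MAX];
    int count;
};

/* Model driver: counts what it was asked to send. */
int model_driver_xmit(struct model_dev *dev, int n,
                      struct model_frame **frames)
{
    for (int i = 0; i < n; i++) {
        dev->sent++;
        if (frames[i]->has_tx_meta)
            dev->sent_with_meta++;
    }
    return n;
}

/* Run the hook over the queued frames, then send the survivors in bulk. */
int model_flush(struct model_bulk_queue *bq)
{
    int kept = 0;

    assert(bq->dev && bq->dev->xmit);
    for (int i = 0; i < bq->count; i++) {
        struct model_frame *f = bq->q[i];

        if (!bq->hook || bq->hook(bq->dev, f))
            bq->q[kept++] = f; /* compact the array in place */
    }

    int sent = kept ? bq->dev->xmit(bq->dev, kept, bq->q) : 0;

    bq->count = 0;
    return sent;
}

/* Queue a frame, flushing first if the bulk queue is already full. */
void model_enqueue(struct model_bulk_queue *bq, struct model_frame *f)
{
    if (bq->count == MODEL_BULK_MAX)
        model_flush(bq);
    bq->q[bq->count++] = f;
}

/* Example hook: drop empty frames, stamp a made-up launch time on the rest. */
int model_set_launch_time(const struct model_dev *dev, struct model_frame *f)
{
    if (f->len == 0)
        return 0;
    f->tx_meta = 1000u + (uint64_t)dev->ifindex;
    f->has_tx_meta = 1;
    return 1;
}
```

The point of the model is just that the hook sees the target device, so a
per-driver metadata writer could be picked there without adding a second
hook inside the driver. How the driver would later turn that metadata into
a TX descriptor is left open here.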

> I've been thinking about mostly two cases:
> - XDP_TX - I think this one technically doesn't need an extra hook;
> all metadata manipulations can be done at xdp_rx? (however, not sure
> how real that is, since the descriptors are probably not exposed over
> there?)

Well, to me XDP_REDIRECT is the most interesting one (see above). I
think we could even drop the XDP_TX case and only do this for
XDP_REDIRECT, since XDP_TX is basically a special-case optimisation.
I.e., it's possible to XDP_REDIRECT back to the same device, the frames
will just take a slight detour up through the stack; but that could also
be a good thing if it means we'll have to do less surgery to the drivers
to implement this for two paths.

It does have the same challenge as you outlined above, though: At that
point the TX descriptor probably doesn't exist, so the driver NDO will
have to do something else with the data; but maybe we can solve that
without moving the hook into the driver itself somehow?

> - AF_XDP TX - this one needs something deep in the driver (due to tx
> zc) to populate the descriptors?

Yeah, this one is a bit more challenging, but having a way to process
AF_XDP frames in the kernel before they're sent out would be good in any
case (for things like policing what packets an AF_XDP application can
send in a cloud deployment, for instance). Would be best if we could
consolidate the XDP_REDIRECT and AF_XDP paths, I suppose...

> - anything else?

Well, see above ;)

>> In any case, the above is just my immediate brain dump (I've been
>> mulling these things over for a while in relation to the queueing
>> stuff), and I'd certainly welcome more discussion on the subject! :)
>
> Awesome, thanks for the dump! Will try to keep you in the loop!

Great, thanks!

-Toke



