Vincent Li <vincent.mc.li@xxxxxxxxx> writes:

> On Tue, Jan 10, 2023 at 7:23 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>>
>> Vincent Li <vincent.mc.li@xxxxxxxxx> writes:
>>
>> > On Mon, Jan 2, 2023 at 3:34 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>> >>
>> >> Vincent Li <vincent.mc.li@xxxxxxxxx> writes:
>> >>
>> >> > On Wed, Dec 14, 2022 at 2:53 PM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>> >> >>
>> >> >> Vincent Li <vincent.mc.li@xxxxxxxxx> writes:
>> >> >>
>> >> >> > Hi,
>> >> >> >
>> >> >> > If I have an user space stack like mTCP works on top of AF_XDP as tcp stateful packet filter to drop tcp packet like tcp syn/rst/ack flood or other tcp attack, and redirect good tcp packet back to linux host stack after mTCP filtering, is that possible?
>> >> >>
>> >> >> Not really, no. You can inject it using regular userspace methods (say, a TUN device), or using AF_XDP on a veth device. But in both cases the packet will come in on a different interface, so it's not really transparent. And performance is not great either.
>> >> >
>> >> > I have thought about it more :) what about this scenario
>> >> >
>> >> > good tcp rst/ack or bad flooding rst/ack -> NIC1 -> mTCP+AF_XDP ->NIC2
>> >> >
>> >> > NIC1 and NIC2 on the same host, drop flooding rst/ack by mTCP, redirect good tcp rst/ack to NIC2, is that possible?
>> >>
>> >> You can do this if NIC2 is a veth device: you inject packets into the veth on the TX side, they come out on the other side and from the kernel PoV it looks like all packets come in on the peer veth. You'll need to redirect packets the other way as well.
>> >>
>> >> > any performance impact?
>> >>
>> >> Yes, obviously :)
>> >>
>> >> >> In general, if you want to filter traffic before passing it on to the kernel, the best bet is to implement your filtering in BPF and run it as an XDP program.
>> >> >
>> >> > I am thinking for scenario like tcp rst/ack flood DDOS attack to NIC1 above, I can't simply drop every rst/ack because there could be legitimate rst/ack, in this case since mTCP can validate legitimate stateful tcp connection, drop flooding rst/ack packet, redirect good rst/ack to NIC2. I am not sure a BPF XDP program attached to NIC1 is able to do stateful TCP packet filtering, does that make sense to you?
>> >>
>> >> It makes sense in the "it can probably be made to work" sense. Not in the "why would anyone want to do this" sense. If you're trying to protect against SYN flooding using XDP there are better solutions than proxying things through a user space TCP stack. See for instance Maxim's synproxy patches:
>> >>
>> >
>> > SYN flooding is just one of the example, what I have in mind is an user space TCP/IP stack runs on top of AF_XDP as middle box/proxy for packet filtering or load balancing, like F5 BIG-IP runs an user space TCP/IP stack on top of AF_XDP. I thought open source mTCP + AF_XDP could be a similar use case as middle box. user space TCP/IP stack + AF_XDP as middle box/proxy, the performance is not going to be good?
>>
>> Well, you can certainly build a proxy using AF_XDP by intercepting all the traffic and bridging it onto a veth device, say. I've certainly heard of people doing that.
>> It'll have some non-trivial overhead, though; even if AF_XDP is fairly high performance, you're still making all traffic take an extra hop through userspace, and you'll lose features like hardware TSO, etc. Whether it can be done with "good" performance depends on your use case, I guess (i.e., how do you define "good performance"?).
>>
>> I guess I don't really see the utility in having a user-space TCP stack be a middlebox? If you're doing packet-level filtering, you could just do that in regular XDP (and the same for load balancing, see e.g., Katran), and if you want to do application-level filtering (say, a WAF), you could just use the kernel TCP stack?
>>
>
> the reason I mention user-space TCP stack is user space stack appears performs better than kernel TCP stack, and we see user-space stack + DPDK for high speed packet processing applications out there, since XDP/AF_XDP seems to be competing with DPDK, so I thought why not user space stack + AF_XDP :)

Well, there's a difference between running a user-level stack directly in the end application, or using it as a middlebox. The latter just adds overhead, and again, I really don't see why you'd want to do that?

Also, the mTCP web site cites tests against a 3.10 kernel, and the code doesn't look like it's been touched for years. So I'd suggest running some up-to-date tests against a modern kernel (and trying things like io_uring if your concern is syscall overhead for small flows) before drawing any conclusions about performance :)

That being said, it's certainly *possible* to do what you're suggesting; there's even a PMD driver in DPDK for AF_XDP, so in that sense it's pluggable. So, like, feel free to try it out? I'm just cautioning against thinking it some kind of magic bullet; packet processing at high speeds in software is *hard*, so the details matter a lot, and it's really easy to throw away any performance gains by inefficiencies elsewhere in the stack (which goes for both the kernel stack, XDP, and AF_XDP).

-Toke
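[Editor's sketch, not part of the thread: to make the "do the packet-level filtering in regular XDP" suggestion concrete, a minimal filter along the lines discussed above might look like the following. The seen_flows map, its sizing, and the "only pass RSTs for flows we have already seen traffic on" policy are illustrative assumptions, not a complete RST-flood mitigation; a real filter would also need flow expiry and likely synproxy-style handling for SYN floods.]

/* Drop TCP RST segments for flows this NIC has not seen any other
 * traffic on; pass everything else up to the kernel stack.
 * Illustrative sketch only.
 */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/in.h>
#include <linux/tcp.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

struct flow_key {
	__u32 saddr;
	__u32 daddr;
	__u16 sport;
	__u16 dport;
};

struct {
	__uint(type, BPF_MAP_TYPE_LRU_HASH);
	__uint(max_entries, 1 << 20);
	__type(key, struct flow_key);
	__type(value, __u64);	/* timestamp of last packet seen */
} seen_flows SEC(".maps");

SEC("xdp")
int rst_filter(struct xdp_md *ctx)
{
	void *data = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;

	/* Parse Ethernet header, pass anything that isn't IPv4. */
	struct ethhdr *eth = data;
	if ((void *)(eth + 1) > data_end)
		return XDP_PASS;
	if (eth->h_proto != bpf_htons(ETH_P_IP))
		return XDP_PASS;

	/* Parse IPv4 header, pass anything that isn't TCP. */
	struct iphdr *iph = (void *)(eth + 1);
	if ((void *)(iph + 1) > data_end)
		return XDP_PASS;
	if (iph->ihl < 5 || iph->protocol != IPPROTO_TCP)
		return XDP_PASS;

	struct tcphdr *tcp = (void *)iph + iph->ihl * 4;
	if ((void *)(tcp + 1) > data_end)
		return XDP_PASS;

	struct flow_key key = {
		.saddr = iph->saddr,
		.daddr = iph->daddr,
		.sport = tcp->source,
		.dport = tcp->dest,
	};
	__u64 now = bpf_ktime_get_ns();

	if (tcp->rst) {
		/* Only let RSTs through for flows we have seen before. */
		if (!bpf_map_lookup_elem(&seen_flows, &key))
			return XDP_DROP;
		return XDP_PASS;
	}

	/* Remember non-RST traffic so later RSTs for this flow pass. */
	bpf_map_update_elem(&seen_flows, &key, &now, BPF_ANY);
	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";

[Compiled with clang -O2 -target bpf and attached to NIC1 (e.g. via ip link set dev <nic1> xdp obj rst_filter.o sec xdp), the drop decision stays in the kernel fast path instead of round-tripping every packet through a userspace stack.]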