Vincent Li <vincent.mc.li@xxxxxxxxx> writes:

> On Tue, Jan 10, 2023 at 7:23 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>>
>> Vincent Li <vincent.mc.li@xxxxxxxxx> writes:
>>
>> > On Mon, Jan 2, 2023 at 3:34 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>> >>
>> >> Vincent Li <vincent.mc.li@xxxxxxxxx> writes:
>> >>
>> >> > On Wed, Dec 14, 2022 at 2:53 PM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>> >> >>
>> >> >> Vincent Li <vincent.mc.li@xxxxxxxxx> writes:
>> >> >>
>> >> >> > Hi,
>> >> >> >
>> >> >> > If I have an user space stack like mTCP works on top of AF_XDP as tcp stateful packet filter to drop tcp packet like tcp syn/rst/ack flood or other tcp attack, and redirect good tcp packet back to linux host stack after mTCP filtering, is that possible?
>> >> >>
>> >> >> Not really, no. You can inject it using regular userspace methods (say, a TUN device), or using AF_XDP on a veth device. But in both cases the packet will come in on a different interface, so it's not really transparent. And performance is not great either.
>> >> >
>> >> > I have thought about it more :) what about this scenario
>> >> >
>> >> > good tcp rst/ack or bad flooding rst/ack -> NIC1 -> mTCP+AF_XDP ->NIC2
>> >> >
>> >> > NIC1 and NIC2 on the same host, drop flooding rst/ack by mTCP, redirect good tcp rst/ack to NIC2, is that possible?
>> >>
>> >> You can do this if NIC2 is a veth device: you inject packets into the veth on the TX side, they come out on the other side and from the kernel PoV it looks like all packets come in on the peer veth. You'll need to redirect packets the other way as well.
>> >>
>> >> > any performance impact?
>> >>
>> >> Yes, obviously :)
>> >>
>> >> >> In general, if you want to filter traffic before passing it on to the kernel, the best bet is to implement your filtering in BPF and run it as an XDP program.
>> >> >
>> >> > I am thinking for scenario like tcp rst/ack flood DDOS attack to NIC1 above, I can't simply drop every rst/ack because there could be legitimate rst/ack, in this case since mTCP can validate legitimate stateful tcp connection, drop flooding rst/ack packet, redirect good rst/ack to NIC2. I am not sure a BPF XDP program attached to NIC1 is able to do stateful TCP packet filtering, does that make sense to you?
>> >>
>> >> It makes sense in the "it can probably be made to work" sense. Not in the "why would anyone want to do this" sense. If you're trying to protect against SYN flooding using XDP there are better solutions than proxying things through a user space TCP stack. See for instance Maxim's synproxy patches:
>> >>
>> >
>> > SYN flooding is just one of the example, what I have in mind is an user space TCP/IP stack runs on top of AF_XDP as middle box/proxy for packet filtering or load balancing, like F5 BIG-IP runs an user space TCP/IP stack on top of AF_XDP. I thought open source mTCP + AF_XDP could be a similar use case as middle box. user space TCP/IP stack + AF_XDP as middle box/proxy, the performance is not going to be good?
>>
>> Well, you can certainly build a proxy using AF_XDP by intercepting all the traffic and bridging it onto a veth device, say. I've certainly heard of people doing that.
>> It'll have some non-trivial overhead, though; even if AF_XDP is fairly high performance, you're still making all traffic take an extra hop through userspace, and you'll lose features like hardware TSO, etc. Whether it can be done with "good" performance depends on your use case, I guess (i.e., how do you define "good performance"?).
>>
>> I guess I don't really see the utility in having a user-space TCP stack be a middlebox? If you're doing packet-level filtering, you could just do that in regular XDP (and the same for load balancing, see e.g., Katran), and if you want to do application-level filtering (say, a WAF), you could just use the kernel TCP stack?
>>
>
> the reason I mention user-space TCP stack is user space stack appears performs better than kernel TCP stack, and we see user-space stack + DPDK for high speed packet processing applications out there, since XDP/AF_XDP seems to be competing with DPDK, so I thought why not user space stack + AF_XDP :)

Well, there's a difference between running a user-level stack directly in the end application, or using it as a middlebox. The latter just adds overhead, and again, I really don't see why you'd want to do that?

Also, the mTCP web site cites tests against a 3.10 kernel, and the code doesn't look like it's been touched for years. So I'd suggest running some up-to-date tests against a modern kernel (and trying things like io_uring if your concern is syscall overhead for small flows) before drawing any conclusions about performance :)

That being said, it's certainly *possible* to do what you're suggesting; there's even a PMD driver in DPDK for AF_XDP, so in that sense it's pluggable. So, like, feel free to try it out? I'm just cautioning against thinking it some kind of magic bullet; packet processing at high speeds in software is *hard*, so the details matter a lot, and it's really easy to throw away any performance gains by inefficiencies elsewhere in the stack (which goes for both the kernel stack, XDP, and AF_XDP).

-Toke
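[Editor's sketch, not part of the thread: to make the "do the packet-level filtering in regular XDP" suggestion concrete, a minimal filter along the lines discussed above might look like the following. The seen_flows map, its sizing, and the "only pass RSTs for flows we have already seen traffic on" policy are illustrative assumptions, not a complete RST-flood mitigation; a real filter would also need flow expiry and likely synproxy-style handling for SYN floods.]

/* Drop TCP RST segments for flows this NIC has not seen any other
 * traffic on; pass everything else up to the kernel stack.
 * Illustrative sketch only.
 */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/in.h>
#include <linux/tcp.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

struct flow_key {
	__u32 saddr;
	__u32 daddr;
	__u16 sport;
	__u16 dport;
};

struct {
	__uint(type, BPF_MAP_TYPE_LRU_HASH);
	__uint(max_entries, 1 << 20);
	__type(key, struct flow_key);
	__type(value, __u64);	/* timestamp of last packet seen */
} seen_flows SEC(".maps");

SEC("xdp")
int rst_filter(struct xdp_md *ctx)
{
	void *data = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;

	/* Parse Ethernet header, pass anything that isn't IPv4. */
	struct ethhdr *eth = data;
	if ((void *)(eth + 1) > data_end)
		return XDP_PASS;
	if (eth->h_proto != bpf_htons(ETH_P_IP))
		return XDP_PASS;

	/* Parse IPv4 header, pass anything that isn't TCP. */
	struct iphdr *iph = (void *)(eth + 1);
	if ((void *)(iph + 1) > data_end)
		return XDP_PASS;
	if (iph->ihl < 5 || iph->protocol != IPPROTO_TCP)
		return XDP_PASS;

	struct tcphdr *tcp = (void *)iph + iph->ihl * 4;
	if ((void *)(tcp + 1) > data_end)
		return XDP_PASS;

	struct flow_key key = {
		.saddr = iph->saddr,
		.daddr = iph->daddr,
		.sport = tcp->source,
		.dport = tcp->dest,
	};
	__u64 now = bpf_ktime_get_ns();

	if (tcp->rst) {
		/* Only let RSTs through for flows we have seen before. */
		if (!bpf_map_lookup_elem(&seen_flows, &key))
			return XDP_DROP;
		return XDP_PASS;
	}

	/* Remember non-RST traffic so later RSTs for this flow pass. */
	bpf_map_update_elem(&seen_flows, &key, &now, BPF_ANY);
	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";

[Compiled with clang -O2 -target bpf and attached to NIC1 (e.g. via ip link set dev <nic1> xdp obj rst_filter.o sec xdp), the drop decision stays in the kernel fast path instead of round-tripping every packet through a userspace stack.]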