On 19/06/2020 17:15, Pablo Neira Ayuso wrote:
>>>>> Why not make a patch to publicly expose the skb's data via nft_meta?
>>>>> No more custom modules, no more userspace modifications [..]
>>>>
>>>> For our particular use case, we are running the skb through the kernel
>>>> function `skb_validate_network_len()` with custom mtu size [..]

(the function name is skb_gso_validate_network_len, my mistake)

I previously expressed a strong opinion that our "hack" to send icmp
rejects on Layer 2 will not be useful for anyone else. But the existence
of the commit from Michael Braun proves that I was wrong, and Jan
Engelhardt was right: it probably makes sense to implement the
functionality that we need within the "new" nft infrastructure.

As far as I understand, the part that is missing in the existing
implementation is exposure (in some form) of the
`skb_gso_validate_network_len()` function to user-configurable filters.
Because the kernel does not expose the _size_ under which a gso skb can
be segmented, but only the _boolean_ with the meaning "this gso skb can
fit in the mtu that you've specified", I could envision a new match that
could be named something like "fits-in-mtu-size" or "segmentable-under".
Then an nftables rule could look roughly like this (for ipv4):

    nft insert rule bridge filter FORWARD \
        ip frag-off & 0x4000 != 0 \
        ip protocol tcp \
        not tcp segmentable-under 1400 \
        reject with icmp type frag-needed

This new function would act the same as "ip len < XXX" for non-gso skbs,
and call skb_gso_validate_network_len(skb, XXX) for gso skbs.

Do you think it makes sense? Shall I try to implement this and submit a
patch?

Thank you,

Eugene

>>> I find no such function in the current or past kernels. Perhaps you could post
>>> the code of the module(s) you already have, and we can assess if it, or the
>>> upstream ideals, can be massaged to make the code stick.
>>
>> I really really don't see our module being useful for anyone else! Even
>> for us, it's just a stopgap measure, hopefully to be dropped after a few
>> months. That said, I believe that the company will have no objections
>> against publishing it. I've uploaded initial (untested) code on github
>> here https://github.com/crosser/ebt-pmtud, in case anyone is interested.
>
> I think there is a way to achieve this with nft 0.9.6 ?
>
> commit 2a20b5bdbde8a1b510f75b1522772b07e51a77d7
> Author: Michael Braun <...>
> Date: Wed May 6 11:46:23 2020 +0200
>
>     datatype: add frag-needed (ipv4) to reject options
>
>     This enables to send icmp frag-needed messages using reject target.
>
>     I have a bridge with connects an gretap tunnel with some ethernet lan.
>     On the gretap device I use ignore-df to avoid packets being lost without
>     icmp reject to the sender of the bridged packet.
>
>     Still I want to avoid packet fragmentation with the gretap packets.
>     So I though about adding an nftables rule like this:
>
>     nft insert rule bridge filter FORWARD \
>         ip protocol tcp \
>         ip length > 1400 \
>         ip frag-off & 0x4000 != 0 \
>         reject with icmp type frag-needed
>
>     This would reject all tcp packets with ip dont-fragment bit set that are
>     bigger than some threshold (here 1400 bytes). The sender would then receive
>     ICMP unreachable - fragmentation needed and reduce its packet size (as
>     defined with PMTU).
>
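To make the proposed "segmentable-under" semantics concrete, here is a rough userspace model of the check. It is only a sketch under stated assumptions: `struct pkt_model`, its fields and `segmentable_under()` are hypothetical names invented for illustration, not kernel or nftables API. The per-segment length is approximated as network header + transport header + gso_size, which is assumed to be what skb_gso_validate_network_len() compares against the mtu. The non-gso branch uses "<=" so that "not segmentable-under 1400" matches the same packets as "ip length > 1400" in the quoted rule; that boundary choice is also an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the few sk_buff facts the check needs. */
struct pkt_model {
    bool   is_gso;        /* in the kernel this would come from skb_is_gso(skb) */
    size_t ip_len;        /* total IP length of the (possibly aggregated) packet */
    size_t net_hdr_len;   /* network (IP) header length */
    size_t trans_hdr_len; /* transport (TCP) header length */
    size_t gso_size;      /* payload bytes per segment; meaningful only if is_gso */
};

/*
 * Hypothetical match: true if the packet, or each segment it would be
 * split into, fits within mtu at the network layer.
 */
bool segmentable_under(const struct pkt_model *p, size_t mtu)
{
    if (!p->is_gso)
        return p->ip_len <= mtu;   /* plain packet: compare total IP length */

    /* gso packet: ignore the aggregate length, look at one resulting segment */
    return p->net_hdr_len + p->trans_hdr_len + p->gso_size <= mtu;
}
```

In this model a 65000-byte gso packet with 20-byte IP and TCP headers and gso_size 1360 counts as fitting under 1400, while a plain 1500-byte packet does not, which is the distinction that "ip length > 1400" alone cannot make.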