On Fri, Mar 17, 2023 at 10:31:43PM +0000, Jeremy Sowden wrote: > On 2023-03-16, at 10:23:34 +0100, Florian Westphal wrote: > > Jeremy Sowden <jeremy@xxxxxxxxxx> wrote: > > > The xt_dccp iptables module supports the matching of DCCP packets based > > > on the presence or absence of DCCP options. Extend nft_exthdr to add > > > this functionality to nftables. > > > > > > Link: https://bugzilla.netfilter.org/show_bug.cgi?id=930 > > > Signed-off-by: Jeremy Sowden <jeremy@xxxxxxxxxx> > > > --- > > > include/uapi/linux/netfilter/nf_tables.h | 2 + > > > net/netfilter/nft_exthdr.c | 105 +++++++++++++++++++++++ > > > +struct nft_exthdr_dccp { > > > + struct nft_exthdr exthdr; > > > + /* A buffer into which to copy the DCCP packet options for parsing. The > > > + * options are located between the packet header and its data. The > > > + * offset of the data from the start of the header is stored in an 8-bit > > > + * field as the number of 32-bit words, so the options will definitely > > > + * be shorter than `4 * U8_MAX` bytes. > > > + */ > > > + u8 optbuf[4 * U8_MAX]; > > > +}; > > > + > > > static unsigned int optlen(const u8 *opt, unsigned int offset) > > > { > > > /* Beware zero-length options: make finite progress */ > > > @@ -406,6 +418,70 @@ static void nft_exthdr_sctp_eval(const struct nft_expr *expr, > > > regs->verdict.code = NFT_BREAK; > > > } > > > > > > +static void nft_exthdr_dccp_eval(const struct nft_expr *expr, > > > + struct nft_regs *regs, > > > + const struct nft_pktinfo *pkt) > > > +{ > > > + struct nft_exthdr_dccp *priv_dccp = nft_expr_priv(expr); > > > + struct nft_exthdr *priv = &priv_dccp->exthdr; > > > + u32 *dest = ®s->data[priv->dreg]; > > > + unsigned int optoff, optlen, i; > > > + const struct dccp_hdr *dh; > > > + struct dccp_hdr _dh; > > > + const u8 *options; > > > + > > > + if (pkt->tprot != IPPROTO_DCCP || pkt->fragoff) > > > + goto err; > > > + > > > + dh = skb_header_pointer(pkt->skb, nft_thoff(pkt), sizeof(_dh), &_dh); > > > + if (!dh) > > > + goto err; > > > + > > > + if (dh->dccph_doff * 4 < __dccp_hdr_len(dh)) > > > + goto err; > > > + > > > + optoff = __dccp_hdr_len(dh); > > > + optlen = dh->dccph_doff * 4 - optoff; > > > > Perhaps reorder this slightly: > > > > optoff = __dccp_hdr_len(dh); > > if (dh->dccph_doff * 4 <= optoff) > > goto err; > > > > optlen = dh->dccph_doff * 4 - optoff; > > > > options = skb_header_pointer(pkt->skb, nft_thoff(pkt) + optoff, optlen, > > priv_dccp->optbuf); > > > > This isn't safe. priv_dccp->optbuf is neither percpu nor is there > > something that prevents a softinterrupt from firing. > > > > I suggest you have a look at 'pipapo' set type which uses percpu scratch > > maps. > > > > Yet another alternative is to provide a small onstack scratch buffer, > > say 256 byte, and fall back to kmalloc for larger spaces. > > > > Or, always use a on-stack buffer that gets re-used for each of the > > parsed options. > > On giving it some more thought, it occurred to me that there would be no > need to read the option data, only the type and length, so one could do > something like this: > > optoff = __dccp_hdr_len(dh); > if (dh->dccph_doff * 4 <= optoff) > goto err; > > optlen = dh->dccph_doff * 4 - optoff; > > for (i = 0; i < optlen; ) { > u8 buf[2], *ptr, type, len; > > ptr = skb_header_pointer(pkt->skb, thoff + optoff + i, > optlen - i > 1 ? 2 : 1, &buf); > if (!ptr) > goto err; > > type = ptr[0]; > > if (type <= 31) > len = 1; > else { > if (optlen - i < 2) > goto err; > > len = ptr[1]; > > if (len < 2) > goto err; > } > > if (type == priv->type) { > *dest = 1; > return; > } > > i += len; > } I believe this should be fine.