Re: [PATCH net 4/4] netfilter: nft_inner: incorrect percpu area handling under softirq

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Nov 28, 2024 at 1:38 PM Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
>
> Softirq can interrupt packet from process context which walks over the
> percpu area.
>
> Add routines to disable bh while restoring and saving the tunnel parser
> context from percpu area to stack. Add a skbuff owner for this percpu
> area to catch softirq interference to exercise the packet tunnel parser
> again in such case.
>
> Reported-by: syzbot+84d0441b9860f0d63285@xxxxxxxxxxxxxxxxxxxxxxxxx
> Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching")
> Signed-off-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> ---
>  include/net/netfilter/nf_tables_core.h |  1 +
>  net/netfilter/nft_inner.c              | 56 ++++++++++++++++++++------
>  2 files changed, 45 insertions(+), 12 deletions(-)
>
> diff --git a/include/net/netfilter/nf_tables_core.h b/include/net/netfilter/nf_tables_core.h
> index ff27cb2e1662..dae0e7592934 100644
> --- a/include/net/netfilter/nf_tables_core.h
> +++ b/include/net/netfilter/nf_tables_core.h
> @@ -161,6 +161,7 @@ enum {
>  };
>
>  struct nft_inner_tun_ctx {
> +       struct sk_buff *skb;    /* percpu area owner */
>         u16     type;
>         u16     inner_tunoff;
>         u16     inner_lloff;
> diff --git a/net/netfilter/nft_inner.c b/net/netfilter/nft_inner.c
> index 928312d01eb1..fcaa126ac8da 100644
> --- a/net/netfilter/nft_inner.c
> +++ b/net/netfilter/nft_inner.c
> @@ -210,35 +210,65 @@ static int nft_inner_parse(const struct nft_inner *priv,
>                            struct nft_pktinfo *pkt,
>                            struct nft_inner_tun_ctx *tun_ctx)
>  {
> -       struct nft_inner_tun_ctx ctx = {};
>         u32 off = pkt->inneroff;
>
>         if (priv->flags & NFT_INNER_HDRSIZE &&
> -           nft_inner_parse_tunhdr(priv, pkt, &ctx, &off) < 0)
> +           nft_inner_parse_tunhdr(priv, pkt, tun_ctx, &off) < 0)
>                 return -1;
>
>         if (priv->flags & (NFT_INNER_LL | NFT_INNER_NH)) {
> -               if (nft_inner_parse_l2l3(priv, pkt, &ctx, off) < 0)
> +               if (nft_inner_parse_l2l3(priv, pkt, tun_ctx, off) < 0)
>                         return -1;
>         } else if (priv->flags & NFT_INNER_TH) {
> -               ctx.inner_thoff = off;
> -               ctx.flags |= NFT_PAYLOAD_CTX_INNER_TH;
> +               tun_ctx->inner_thoff = off;
> +               tun_ctx->flags |= NFT_PAYLOAD_CTX_INNER_TH;
>         }
>
> -       *tun_ctx = ctx;
>         tun_ctx->type = priv->type;
> +       tun_ctx->skb = pkt->skb;
>         pkt->flags |= NFT_PKTINFO_INNER_FULL;
>
>         return 0;
>  }
>
> +static bool nft_inner_restore_tun_ctx(const struct nft_pktinfo *pkt,
> +                                     struct nft_inner_tun_ctx *tun_ctx)
> +{
> +       struct nft_inner_tun_ctx *this_cpu_tun_ctx;
> +
> +       local_bh_disable();
> +       this_cpu_tun_ctx = this_cpu_ptr(&nft_pcpu_tun_ctx);
> +       if (this_cpu_tun_ctx->skb != pkt->skb) {

I must say I do not understand this patch.

If a context is used by a save/restore more than one time per packet
traversal, then this means we can not use per-cpu storage,
or risk flakes.

Also, skb could be freed and re-allocated ?

Perhaps describe a bit more what is going on in the changelog.





[Index of Archives]     [Netfitler Users]     [Berkeley Packet Filter]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux