Re: [PATCH nf] netfilter: nfnetlink_queue: reroute reinjected packets from postrouting

Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> · Wed, 18 Sep 2024 10:30:20 +0200

Hi Antonio,

On Tue, Sep 17, 2024 at 11:01:31PM +0100, Antonio Ojea wrote:
> On Fri, 13 Sept 2024 at 07:24, Antonio Ojea
> <antonio.ojea.garcia@xxxxxxxxx> wrote:
> >
> > On Thu, 12 Sept 2024 at 20:58, Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> > >
> > > 368982cd7d1b ("netfilter: nfnetlink_queue: resolve clash for unconfirmed
> > > conntracks") adjusts NAT again in case that packet loses race to confirm
> > > the conntrack entry.
> > >
> > > The reinject path triggers a route lookup again for the output hook, but
> > > not for the postrouting hook where queue to userspace is also possible.
> > >
> > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > Reported-by: Antonio Ojea <antonio.ojea.garcia@xxxxxxxxx>
> > > Signed-off-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> > > ---
> > > I tried but I am not managing to make a selftest that runs reliable.
> > > I can reproduce it manually and validate that this works.
> > >
> > > ./nft_queue -d 1000 helps by introducing a delay of 1000ms in the
> > > userspace queue processing which helps trigger the race more easily,
> > > socat needs to send several packets in the same UDP flow.
> > >
> > > @Antonio: Could you try this patch meanwhile there is a testcase for
> > > this.
> >
> > Let me test it and report back
> >
> 
> Ok, I finally managed to get this tested, and it does not seem to
> solve the problem, it keeps dnating twice after the packet is enqueued
> by nfqueue

dnatting twice is required to deal with the conntrack confirmation race.

packet 1 enters prerouting, dnat is done using IP A as destination
packet 2 enters prerouting, dnat is done using IP B as destination
packet 1 is enqueued to userspace from postrouting, with unconfirmed conntrack
packet 2 is enqueued to userspace from postrouting, with unconfirmed conntrack
packet 1 is reinjected back to kernelspace from postrouting, conntrack
is confirmed, it uses IP A as destination.
*packet 2* is reinjected back to kernelspace from postrouting, ct lookup
tells us packet lost race, unconfirmed conntrack is dropped, update
mangling to use IP A for consistency (because packet 2 is using IP B).
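
As a rough illustration of the sequence above, here is a toy Python model.
It is not kernel code: the dict-based "conntrack table", the field names
and the helper functions are all made up for the sketch.

```python
# Toy model of the conntrack confirmation race; all names are hypothetical.
confirmed = {}  # flow key -> DNAT destination of the confirmed conntrack


def prerouting(flow, dnat_to):
    # Each packet gets its own unconfirmed conntrack with its own DNAT pick.
    # The route is resolved for the destination chosen at this point.
    return {"flow": flow, "dst": dnat_to, "route": dnat_to}


def reinject(pkt):
    # The first packet reinjected for a flow confirms its conntrack.
    winner = confirmed.setdefault(pkt["flow"], pkt["dst"])
    if winner != pkt["dst"]:
        # Lost the race: the unconfirmed entry is dropped and the packet is
        # mangled again to match the confirmed one. The route is untouched.
        pkt["dst"] = winner
    return pkt


p1 = prerouting("udp-flow", "A")
p2 = prerouting("udp-flow", "B")
p1 = reinject(p1)  # confirms the conntrack with destination A
p2 = reinject(p2)  # loses the race, destination rewritten from B to A
```

In this model packet 2 ends up with destination A while its route still
refers to B, which is the inconsistency discussed below.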

So far we have been assuming races with conntrack confirmation are
unlikely, thus, this scenario is handled as a corner case which
requires this double mangling.

You still see packets being dropped, right? That should not happen.
I might then be missing something else, because my patch also triggers
a re-routing, which is required because packet 2 was finally mangled
to use IP A while its route still points to IP B.
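
The re-routing idea can be sketched in the same toy style. Again this is
only an illustration under made-up names (ROUTES, out_dev and
reroute_if_mangled are hypothetical), not what the patch literally does:

```python
# Hypothetical routing table: destination -> output device.
ROUTES = {"A": "veth-a", "B": "veth-b"}


def reroute_if_mangled(pkt, dst_at_queue_time):
    # If the destination changed while the packet sat in the userspace
    # queue, the cached route is stale and has to be looked up again.
    if pkt["dst"] != dst_at_queue_time:
        pkt["out_dev"] = ROUTES[pkt["dst"]]
    return pkt


# Packet 2 after losing the race: destination is A, route still points to B.
pkt = {"dst": "A", "out_dev": "veth-b"}
pkt = reroute_if_mangled(pkt, dst_at_queue_time="B")
```

Without the second lookup the packet would leave with destination A over
the route that was resolved for B.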