Re: [PATCH v2 1/2] netfilter: conntrack: introduce no_random_port proc entry

On Thu, Nov 03, 2022 at 08:02:08PM +0000, Sriram Yagnaraman wrote:
> On 2022-11-02 15:00, Florian Westphal wrote:
>
> > Sriram Yagnaraman <sriram.yagnaraman@xxxxxxxx> wrote:
> >> On 2022-10-31 09:38, Florian Westphal wrote:
> >>
> >>> sriram.yagnaraman@xxxxxxxx <sriram.yagnaraman@xxxxxxxx> wrote:
> >>>> From: Sriram Yagnaraman <sriram.yagnaraman@xxxxxxxx>
> >>>>
> >>>> This patch introduces a new proc entry to disable source port
> >>>> randomization for SCTP connections.
> >>> Hmm.  Can you elaborate?  The sport is never randomized, unless either
> >>> 1. User explicitly requested it via "random" flag passed to snat rule, or
> >>> 2. there is an existing connection, using the *same* sport:saddr -> daddr:dport
> >>>    quadruple as the new request.
> >>>
> >>> In 2), this new toggle prevents communication.  So I wonder why ...
> >> Thank you so much for the detailed review comments.
> >>
> >> My use case for this flag originates from a deployment of SCTP client
> >> endpoints in docker/kubernetes environments, where there are typically
> >> SNAT rules for the endpoints on egress. The *users* in this case are the
> >> CNI plugins that configure the SNAT rules, and some of the most common
> >> plugins use --random-fully regardless of the protocol.
> >>
> >> Consider an SCTP association A -> B, which has two paths via NAT A and B
> >> A: 1.2.3.4:12345
> >> B: 5.6.7.8/9:42
> >> NAT A: 1.2.31.4 (used for path towards 5.6.7.8)
> >> NAT B: 1.2.32.4 (used for path towards 5.6.7.9)
> >>
> >>               ┌───────┐   ┌───┐
> >>            ┌──► NAT A ├───►   │
> >>  ┌─────┐   │  └───────┘   │   │
> >>  │  A  ├───┤              │ B │
> >>  └─────┘   │  ┌───────┐   │   │
> >>            └──► NAT B ├───►   │
> >>               └───────┘   └───┘
> >>
> >> Let us assume in NAT A (1.2.31.4), the connection is set up as
> >> 	ORIGINAL TUPLE		    REPLY TUPLE
> >> 1.2.3.4:12345 -> 5.6.7.8:42, 5.6.7.8:42 -> 1.2.31.4:33333
> >>
> >> Let us assume in NAT B (1.2.32.4), the connection is set up as
> >> 	ORIGINAL TUPLE		    REPLY TUPLE
> >> 1.2.3.4:12345 -> 5.6.7.9:42, 5.6.7.9:42 -> 1.2.32.4:44444
> >>
> >> Since the port numbers are different when viewed from B, the association
> >> will not become multihomed, with only the primary path being active.
> >> Moreover, on a NAT/middlebox restart, we will end up getting new ports.
> >>
> >> I understand this is a problem in the way SNAT rules are configured; my
> >> proposal was to have this flag as a means of preventing such a problem
> >> even if the user requested port randomization.
> > Ugh, sorry, but that sounds just wrong.
>
> Ok, I hear that. :)
>
> >
> >>>> As specified in RFC9260 all transport addresses used by an SCTP endpoint
> >>>> MUST use the same port number but can use multiple IP addresses. That
> >>>> means that all paths taken within an SCTP association should have the
> >>>> same port even if they pass through different NAT/middleboxes in the
> >>>> network.
> > Hmm, I don't understand WHY this requirement exists, since endpoints
> > cannot control source port (or source address) seen by the peer;
> > NAT won't go away.
> >
> > I read that snippet several times and it's not clear to me if
> > "port number" refers to sport or dport.  Dport would make sense to me,
> > but sport...?  No, not really.
>
> I am just an interpreter of the standard but AFAIU, port means both source
> and destination port. See Section 1.3 of RFC 9260, which defines an SCTP
> endpoint. In any case, running SCTP over UDP is probably the best way to
> work around the SCTP NAT problem.
>
> >
> > Won't the endpoints notice that the path is down and re-create the flow?
> >
> > AFAIU the root cause of your problem is that:
> > 1. NAT middleboxes remap source port AND
> > 2. NAT middleboxes restart frequently
> >
> > ... so fixing either 1 or 2 would avoid the problem.
> >
> > I don't think adding sysctls to override 1) is a sane option.
>
> Yeah, the endpoints do try to re-create the flows, but if we have
> multiple middleboxes remapping the source port, there is no guarantee
> that they will remap to the same source port.
> 1) is the main problem that I was trying to address with this patch.
>
> >> Since the flag is optional, the idea is to enable it only on hosts that
> >> are part of docker/kubernetes environments and use NAT in their datapath.
> > We can't fix the ruleset but we can somehow cure it via sysctl in each netns?
> > I don't like this.
> >
> > NAT middlebox restart with --random is a problem in any case, not just
> > for SCTP, because the chosen "random port" is lost.
> >
> > I don't see a way to fix this, unless NOT using --random mode.
> > If a connection is subject to sequence number rewrite (for tcp)
> > the connection won't survive either as the seqadj state is lost.
>
> Ok, I understand your point. I agree it doesn't make sense to have an
> alternative configuration option to avoid this problem. I will try to
> convince the "users" not to use --random-fully for SCTP.

FWIW I share Florian's opinion here. With the explanations above, it
doesn't make sense to have an override in kernel for an option that
userspace is supplying at will.
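
For anyone reading this thread later, here is a toy sketch of the failure
mode Sriram describes, using the addresses from his example. It is not how
nf_nat works internally: clash resolution, conntrack state and restarts are
not modelled, and the names snat(), peer_view() and single_port() as well as
the port range are made up for illustration. It only shows that two
independent NAT boxes preserving the source port present one port to the
peer, while two boxes picking a random port almost certainly will not.

```python
import random

PORT_MIN, PORT_MAX = 1024, 65535  # assumed range for remapped ports


def snat(src, nat_ip, randomize, rng):
    """Return the (address, port) the peer sees after a toy SNAT step."""
    _, port = src
    if randomize:
        # Stand-in for --random / --random-fully: ignore the original port.
        port = rng.randint(PORT_MIN, PORT_MAX)
    return (nat_ip, port)


def peer_view(src, nat_ips, randomize, seed=None):
    """Translate one endpoint through several independent NAT boxes."""
    rng = random.Random(seed)
    return [snat(src, ip, randomize, rng) for ip in nat_ips]


def single_port(paths):
    """True if every path presents the same source port to the peer."""
    return len({port for _, port in paths}) == 1


if __name__ == "__main__":
    a = ("1.2.3.4", 12345)
    nats = ["1.2.31.4", "1.2.32.4"]
    for randomize in (False, True):
        paths = peer_view(a, nats, randomize)
        print(randomize, paths, single_port(paths))
```

With randomize=False both paths keep port 12345, which is what the SNAT
rule already gives you when the random flags are left out and there is no
tuple clash.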

  Marcelo




