Re: [RFC PATCH] netfilter: conntrack: simplify sctp state machine

On Wed, Jan 11, 2023 at 09:36:38AM +0000, Sriram Yagnaraman wrote:
> > -----Original Message-----
> > From: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> > Sent: Friday, 6 January 2023 01:50
> > To: Sriram Yagnaraman <sriram.yagnaraman@xxxxxxxx>
> > Cc: netfilter-devel@xxxxxxxxxxxxxxx; Florian Westphal <fw@xxxxxxxxx>;
> > Marcelo Ricardo Leitner <mleitner@xxxxxxxxxx>; Long Xin
> > <lxin@xxxxxxxxxx>; Claudio Porfiri <claudio.porfiri@xxxxxxxxxxxx>
> > Subject: Re: [RFC PATCH] netfilter: conntrack: simplify sctp state machine
> > 
> > On Thu, Jan 05, 2023 at 12:11:44PM +0000, Sriram Yagnaraman wrote:
> > > > -----Original Message-----
> > > > From: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> > > > Sent: Thursday, 5 January 2023 12:54
> > > > To: Sriram Yagnaraman <sriram.yagnaraman@xxxxxxxx>
> > > > Cc: netfilter-devel@xxxxxxxxxxxxxxx; Florian Westphal
> > > > <fw@xxxxxxxxx>; Marcelo Ricardo Leitner <mleitner@xxxxxxxxxx>; Long
> > > > Xin <lxin@xxxxxxxxxx>; Claudio Porfiri
> > > > <claudio.porfiri@xxxxxxxxxxxx>
> > > > Subject: Re: [RFC PATCH] netfilter: conntrack: simplify sctp state
> > > > machine
> > > >
> > > > On Thu, Jan 05, 2023 at 11:41:13AM +0000, Sriram Yagnaraman wrote:
> > > > > > -----Original Message-----
> > > > > > From: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> > > > > > Sent: Wednesday, 4 January 2023 16:02
> > > > > > To: Sriram Yagnaraman <sriram.yagnaraman@xxxxxxxx>
> > > > > > Cc: netfilter-devel@xxxxxxxxxxxxxxx; Florian Westphal
> > > > > > <fw@xxxxxxxxx>; Marcelo Ricardo Leitner <mleitner@xxxxxxxxxx>;
> > > > > > Long Xin <lxin@xxxxxxxxxx>
> > > > > > Subject: Re: [RFC PATCH] netfilter: conntrack: simplify sctp
> > > > > > state machine
> > > > > >
> > > > > > On Wed, Jan 04, 2023 at 12:31:43PM +0100, Sriram Yagnaraman wrote:
> > > > > > > All the paths in an SCTP connection are kept alive either by
> > > > > > > actual DATA/SACK running through the connection or by HEARTBEAT.
> > > > > > > This patch proposes a simple state machine with only two
> > > > > > > states OPEN_WAIT and ESTABLISHED (similar to UDP). The reason
> > > > > > > for this change is a full stateful approach to SCTP is
> > > > > > > difficult when the association is multihomed since the
> > > > > > > endpoints could use different paths in the network during the
> > > > > > > lifetime of an association.
> > > > > >
> > > > > > Do you mean the router/firewall might not see all packets if the
> > > > > > association is multihomed?
> > > > > >
> > > > > > Could you please provide an example?
> > > > >
> > > > > Let's say the primary and alternate/secondary paths between the
> > > > > SCTP endpoints traverse different middle boxes. If an SCTP
> > > > > endpoint detects network failure on the primary path, it will
> > > > > switch to using the secondary path and all subsequent packets will
> > > > > not be seen by the middlebox on the primary path, including
> > > > > SHUTDOWN sequences if they happen at that time.
> > > >
> > > > OK, then on the primary middle box the SCTP flow will just time out?
> > > > (because no more packets are seen).
> > >
> > > Yes, it will time out unless the primary path comes up before the
> > > SHUTDOWN sequence. And the default timeout for an ESTABLISHED SCTP
> > > connection is 5 days, which is a "long" time to clean up this entry.
> > 
> > Does the middle box have a chance to see any packet that provides a hint to
> > shorten this timeout? Are no HEARTBEAT packets seen in this case on the
> > former primary path?
> 
> There will be HEARTBEAT as soon as a path becomes unreachable from
> the SCTP endpoints. But depending on the location of the network
> failure, the middlebox may or may not see the HEARTBEAT.

Conntrack assumes it sees all traffic that belongs to the flow; this
holds for other protocols too.

> Also, HEARTBEAT is sent when there are no data to be transmitted or
> if the path is unreachable/unconfirmed, so I think there is no
> deterministic way of finding out when to shorten the timeout. Of
> course, a user has the option of setting the ESTABLISHED state
> timeout to a more reasonable value, e.g., the same as the
> HEARTBEAT_ACKED state timeout (210 sec), OR we could reduce the
> default timeout of ESTABLISHED to 210 sec.

Then just set a short ESTABLISHED timeout from the beginning when
multihoming is in place.
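
As a rough illustration of the numbers discussed above, the following Python sketch compares the 5-day ESTABLISHED default with the 210-second HEARTBEAT_ACKED value and reads the currently configured timeout. The sysctl path is an assumption about the running system (it is only expected to exist when SCTP conntrack support is available), and the helper name `read_timeout` is hypothetical.

```python
from pathlib import Path
from typing import Optional

# Values quoted in this thread, in seconds.
ESTABLISHED_DEFAULT = 5 * 24 * 60 * 60  # 5 days
HEARTBEAT_ACKED_DEFAULT = 210

# Assumed sysctl location; may be absent on systems without SCTP conntrack.
SYSCTL = Path("/proc/sys/net/netfilter/nf_conntrack_sctp_timeout_established")


def read_timeout(path: Path = SYSCTL) -> Optional[int]:
    """Return the configured timeout in seconds, or None if unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("established default (s):", ESTABLISHED_DEFAULT)
    print("heartbeat_acked default (s):", HEARTBEAT_ACKED_DEFAULT)
    print("current setting:", read_timeout())
```

Writing a shorter value to the same file (as root) would be the runtime equivalent of "a short ESTABLISHED"; per-flow timeout policies in the ruleset are another option not sketched here.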

> > What I am missing is a more detailed list of issues with the existing
> > approach. Your patch description says "SCTP tracking with multihoming is
> > difficult", so a list of scenarios would probably help to understand the
> > motivation to simplify the state machine.
> 
> Thank you for reviewing and asking these questions; it made me step
> back and think. I list some background below:
> - I want to simplify the state machine, because it is possible to
> track an SCTP connection with fewer states, e.g., SCTP with UDP
> encapsulation uses UDP conntrack with just UNREPLIED/REPLIED states
> and it works perfectly fine.

I think it would be preferable to add some configuration via ruleset to
track SCTP over UDP, rather than demoting SCTP to become almost
stateless.

> - My stakeholders, at whose behest I am proposing these changes,
> hit some problems running SCTP client endpoints behind NAT (inside
> Kubernetes pods) towards multihomed SCTP server endpoints; see
> (1a-g) and (2a-c) below.
> - Some upcoming SCTP protocol changes in the IETF (if
> approved/implemented) will make it hard to read beyond the SCTP
> common header. E.g., DTLS over SCTP,
> https://datatracker.ietf.org/doc/draft-ietf-tsvwg-dtls-over-sctp-bis/,
> proposes to encrypt all SCTP chunks, so conntrack will only be able
> to see the SCTP common header. The changes in this patch will
> hopefully make it easier to adapt to such changes in the SCTP protocol.
> - While at it, I also made some other "improvements":

For this DTLS case it should be possible to fall back to the SCTP
"stateless" approach.

> 	a) Avoid multiple walk-throughs of the SCTP chunks in sctp_new(), sctp_basic_checks() and nf_conntrack_sctp_packet(); parse them only once
> 	b) SCTP conntrack has the same state regardless of whether it is a primary or a secondary path
> 
> Let's say there are two SCTP endpoints A and B with addresses A' and B', B'' respectively.
> Primary path is A' <----> B' that traverses middlebox C, and secondary path is A' <----> B'' that traverses middlebox D.
> 1) SHUTDOWN sent on secondary path
> 1a) SCTP endpoint A sets up an association towards SCTP endpoint B
> 1b) Middlebox C sees INIT sequence and creates "primary" conntrack entry (5 days)
> 1c) Middlebox D sees HEARTBEAT sequence and creates "secondary" conntrack entry (210 seconds)
> 1d) Path failure between A and C, and SCTP endpoint A switches to secondary path and continues sending data on the association
> 1e) SCTP endpoint A decides to SHUTDOWN the connection
> 1f) Middlebox C is in ESTABLISHED state, doesn't see any SHUTDOWN sequence or HEARTBEAT, waits for timeout (5 days)
> 1g) Middlebox D is in HEARTBEAT_ACKED state, doesn't care about SHUTDOWN sequence, waits for timeout (210 seconds)

I guess a similar problem will occur with MP-TCP, and I am not sure
making TCP more stateless is the way to address this.
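
For reference, scenario 1 above can be sketched as a toy timeline in Python. This is only a model under stated assumptions: the `Middlebox` class and `scenario_1` function are hypothetical names, each entry is refreshed by any packet it sees, and neither entry reacts to SHUTDOWN (as described in 1f and 1g). The two timeouts are the values quoted in the thread.

```python
from dataclasses import dataclass

ESTABLISHED_TIMEOUT = 5 * 24 * 3600  # seconds, "primary" entry on C
HEARTBEAT_ACKED_TIMEOUT = 210        # seconds, "secondary" entry on D


@dataclass
class Middlebox:
    """Toy conntrack entry that expires after `timeout` idle seconds."""
    name: str
    timeout: int
    last_seen: float = 0.0

    def see_packet(self, now: float) -> None:
        self.last_seen = now  # any packet refreshes the entry

    def expires_at(self) -> float:
        return self.last_seen + self.timeout


def scenario_1(failure_at: float, shutdown_at: float):
    """Return how long C and D keep their entries after the SHUTDOWN."""
    c = Middlebox("C", ESTABLISHED_TIMEOUT)
    d = Middlebox("D", HEARTBEAT_ACKED_TIMEOUT)
    # 1b, 1d: C sees traffic only until the path between A and C fails.
    c.see_packet(failure_at)
    # 1d, 1e: D carries the association up to the SHUTDOWN.
    d.see_packet(shutdown_at)
    # 1f, 1g: neither entry reacts to SHUTDOWN; both wait for the timeout.
    return c.expires_at() - shutdown_at, d.expires_at() - shutdown_at


if __name__ == "__main__":
    stale_c, stale_d = scenario_1(failure_at=100, shutdown_at=160)
    print("C keeps the entry for", stale_c, "s after SHUTDOWN")
    print("D keeps the entry for", stale_d, "s after SHUTDOWN")
```

In this model the entry on C outlives the association by almost the full 5 days, while D lingers for 210 seconds.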

> 2) Recently fixed by bff3d0534804 ("netfilter: conntrack: add sctp DATA_SENT state")
> 2a) SCTP endpoint A sets up an association towards SCTP endpoint B
> 2b) Middlebox C sees INIT sequence and creates "primary" conntrack entry (5 days)
> 2c) Middlebox D sees DATA/SACK, and DROPS packets until HEARTBEAT is seen to set up "secondary" conntrack entry (210 seconds)

I assume this is already fixed.

Another possibility would be to introduce this alternative
state-machine and use it for multihoming?
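
To make the alternative concrete, here is a minimal Python sketch of a two-state tracker with OPEN_WAIT and ESTABLISHED. The patch itself is not shown in this message, so the transition rule is an assumption modelled on the UDP UNREPLIED/REPLIED behaviour mentioned earlier: the first packet opens the entry, and the first packet in the reply direction establishes it. All names here are hypothetical.

```python
from enum import Enum


class State(Enum):
    NONE = "NONE"
    OPEN_WAIT = "OPEN_WAIT"
    ESTABLISHED = "ESTABLISHED"


class Direction(Enum):
    ORIGINAL = 0
    REPLY = 1


def next_state(state: State, direction: Direction) -> State:
    """Assumed rule: first packet opens, first reply establishes."""
    if state is State.NONE:
        # The first packet seen creates the entry.
        return State.OPEN_WAIT
    if state is State.OPEN_WAIT and direction is Direction.REPLY:
        # Traffic in both directions has now been observed.
        return State.ESTABLISHED
    # No chunk type (SHUTDOWN, ABORT, ...) changes the state in this model;
    # the entry would only go away through its timeout.
    return state


if __name__ == "__main__":
    s = State.NONE
    for d in (Direction.ORIGINAL, Direction.ORIGINAL, Direction.REPLY):
        s = next_state(s, d)
        print(d.name, "->", s.name)
```

Because the rule never inspects chunk types, it applies equally to a primary and a secondary path, which is the property point b) above asks for.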


