RE: [RFC PATCH] netfilter: conntrack: simplify sctp state machine

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: Marcelo Ricardo Leitner <mleitner@xxxxxxxxxx>
> Sent: Wednesday, 11 January 2023 14:54
> To: Sriram Yagnaraman <sriram.yagnaraman@xxxxxxxx>
> Cc: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>; netfilter-devel@xxxxxxxxxxxxxxx;
> Florian Westphal <fw@xxxxxxxxx>; Long Xin <lxin@xxxxxxxxxx>; Claudio
> Porfiri <claudio.porfiri@xxxxxxxxxxxx>
> Subject: Re: [RFC PATCH] netfilter: conntrack: simplify sctp state machine
> 
> On Wed, Jan 11, 2023 at 09:36:38AM +0000, Sriram Yagnaraman wrote:
> > > -----Original Message-----
> > > From: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> > > Sent: Friday, 6 January 2023 01:50
> > > To: Sriram Yagnaraman <sriram.yagnaraman@xxxxxxxx>
> > > Cc: netfilter-devel@xxxxxxxxxxxxxxx; Florian Westphal
> > > <fw@xxxxxxxxx>; Marcelo Ricardo Leitner <mleitner@xxxxxxxxxx>; Long
> > > Xin <lxin@xxxxxxxxxx>; Claudio Porfiri
> > > <claudio.porfiri@xxxxxxxxxxxx>
> > > Subject: Re: [RFC PATCH] netfilter: conntrack: simplify sctp state
> > > machine
> > >
> > > On Thu, Jan 05, 2023 at 12:11:44PM +0000, Sriram Yagnaraman wrote:
> > > > > -----Original Message-----
> > > > > From: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> > > > > Sent: Thursday, 5 January 2023 12:54
> > > > > To: Sriram Yagnaraman <sriram.yagnaraman@xxxxxxxx>
> > > > > Cc: netfilter-devel@xxxxxxxxxxxxxxx; Florian Westphal
> > > > > <fw@xxxxxxxxx>; Marcelo Ricardo Leitner <mleitner@xxxxxxxxxx>;
> > > > > Long Xin <lxin@xxxxxxxxxx>; Claudio Porfiri
> > > > > <claudio.porfiri@xxxxxxxxxxxx>
> > > > > Subject: Re: [RFC PATCH] netfilter: conntrack: simplify sctp
> > > > > state machine
> > > > >
> > > > > On Thu, Jan 05, 2023 at 11:41:13AM +0000, Sriram Yagnaraman wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> > > > > > > Sent: Wednesday, 4 January 2023 16:02
> > > > > > > To: Sriram Yagnaraman <sriram.yagnaraman@xxxxxxxx>
> > > > > > > Cc: netfilter-devel@xxxxxxxxxxxxxxx; Florian Westphal
> > > > > > > <fw@xxxxxxxxx>; Marcelo Ricardo Leitner
> > > > > > > <mleitner@xxxxxxxxxx>; Long Xin <lxin@xxxxxxxxxx>
> > > > > > > Subject: Re: [RFC PATCH] netfilter: conntrack: simplify sctp
> > > > > > > state machine
> > > > > > >
> > > > > > > On Wed, Jan 04, 2023 at 12:31:43PM +0100, Sriram Yagnaraman
> > > wrote:
> > > > > > > > All the paths in an SCTP connection are kept alive either
> > > > > > > > by actual DATA/SACK running through the connection or by
> HEARTBEAT.
> > > > > > > > This patch proposes a simple state machine with only two
> > > > > > > > states OPEN_WAIT and ESTABLISHED (similar to UDP). The
> > > > > > > > reason for this change is a full stateful approach to SCTP
> > > > > > > > is difficult when the association is multihomed since the
> > > > > > > > endpoints could use different paths in the network during
> > > > > > > > the lifetime
> > > of an association.
> > > > > > >
> > > > > > > Do you mean the router/firewall might not see all packets
> > > > > > > for association is multihomed?
> > > > > > >
> > > > > > > Could you please provide an example?
> > > > > >
> > > > > > Let's say the primary and alternate/secondary paths between
> > > > > > the SCTP endpoints traverse different middle boxes. If an SCTP
> > > > > > endpoint detects network failure on the primary path, it will
> > > > > > switch to using the secondary path and all subsequent packets
> > > > > > will not be seen by the middlebox on the primary path,
> > > > > > including SHUTDOWN sequences if they happen at that time.
> > > > >
> > > > > OK, then on the primary middle box the SCTP flow will just timeout?
> > > > > (because no more packets are seen).
> > > >
> > > > Yes, they will timeout unless the primary path comes up before the
> > > > SHUTDOWN sequence. And the default timeout for an ESTABLISHED
> SCTP
> > > > connection is 5 days, which is a "long" time to clean-up this entry.
> > >
> > > Does the middle box have a chance to see any packet that provides a
> > > hint to shorten this timeout? no HEARTBEAT packets are seen in this
> > > case on the former primary path?
> >
> > There will be HEARTBEAT as soon as a path becomes unreachable from the
> SCTP endpoints. But depending on the location of the network failure, the
> middlebox may or may not see the HEARTBEAT. Also, HEARTBEAT is sent
> when there are no data to be transmitted or if the path is
> unreachable/unconfirmed, so I think there is no deterministic way of finding
> out when to shorten the timeout. Of course, a user has the option of setting
> the ESTABLISHED state timeout to a more reasonable value, for e.g., same as
> the HEARTBEAT_ACKED state timeout (210 sec), OR we could reduce the
> default timeout of ESTABLISHED to 210 sec.
> >
> > >
> > > What I am missing are a more detailed list of issues with the
> > > existing approach. Your patch description says "SCTP tracking with
> > > multihoming is difficult", probably a list of scenarios would help
> > > to understand the motivation to simplify the state machine.
> >
> > Thank you for reviewing and asking these questions, it made me step
> > back and think. I list below some background
> > - I want to simplify the state machine, because it is possible to
> > track an SCTP connection with fewer states, for e.g., SCTP with UDP
> > encapsulation uses UDP conntrack with just UNREPLIED/REPLIED states
> > and it works perfectly fine
> 
> [*] (more below)
> 
> > - My stakeholders, at the behest of whom I am proposing these changes
> > hit some problems running SCTP client endpoints behind NAT (inside
> > Kubernetes pods) towards multihomed SCTP server endpoints (1a-g) and
> > (2a-c) below
> > - Some upcoming SCTP protocol changes in IETF (if
> > approved/implemented) will make it hard to read beyond the SCTP common
> > header, for e.g., DTLS over SCTP
> > https://datatracker.ietf.org/doc/draft-ietf-tsvwg-dtls-over-sctp-bis/,
> > proposes to encrypt all SCTP chunks, conntrack will only be able to
> > see SCTP common header, these changes hopefully will make it easier to
> > adapt to such changes in SCTP protocol
> 
> If this is really the case, then I'm confused by this new RFC (be warned I don't
> know the gory details of DTLS). I don't see it specifying that other chunk
> headers will be encrypted, but instead I see texts like:
> 
> 3.7.  Message Sizes
> 
>    ...
>    The sequence of DTLS records is then fragmented into DATA or I-DATA
>    Chunks to fit the path MTU by SCTP.  These changes ensure that DTLS/
>    SCTP has the same capability as SCTP to support user messages of any
>    size. ...
> 
> this is just packing encrypted payload.
> 
> Sections 4.1 and 4.5 also lead to that, and:
> 
> 6.  DTLS over SCTP Service
> 
>    The adoption of DTLS over SCTP according to the current specification
>    is meant to add to SCTP the option for transferring encrypted data.
>    When DTLS over SCTP is used, all data being transferred MUST be
>    protected by chunk authentication and DTLS encrypted.
> 
> and finally:
> 
> 9.6.  Privacy Considerations
> 
>    For each SCTP user message, the user also provides a stream
>    identifier, a flag to indicate whether the message is sent ordered or
>    unordered, and a payload protocol identifier.  Although DTLS/SCTP
>    provides privacy for the actual user message, the other three
>    information fields are not confidentiality protected.  They are sent
>    as clear text because they are part of the SCTP DATA chunk header.
> 
> 
> So AFAICU the claim here is that a middle box wouldn't be able to inspect
> SCTP payload, but the protocol itself is still possible.
> Did I miss something maybe?

Okay, my bad for using somebody else's words instead of reading the draft myself. You are right, the proposal is only to apply SCTP-AUTH to all the chunks, it doesn’t propose encryption/confidentiality. 
Anyhow that was just an example of why making a simplified state machine will avoid changing conntrack when the protocol changes are made, especially with the protocol still being actively improved.

> 
> > - While at it, I also made some other "improvements"
> > 	a) Avoid multiple walk-throughs of SCTP chunks in sctp_new(),
> sctp_basic_checks() and nf_conntrack_sctp_packet(), and parse it only once
> > 	b) SCTP conntrack has the same state regardless of it is a primary or
> > a secondary path
> 
> This a good change and may, actually, be the core reasoning behind the
> change here. 'Primary' is more in the sense of 'active' than anything else (and
> it gets more confusing with CMT-SCTP heh). It's not because a path saw the
> INIT handshake that it is more reliable and so from conntrack side.
> 
> With that, conntrack handling all paths similarly makes it more consistent and
> closer with what the actual SCTP sockets are seeing for such paths.
> 

Yes, I agree. 
 
> One easy example here of a bad situation due to the current difference
> between paths is that an application can start an association through a path
> and tear it down over another one. There's no shutdown seen by the first
> path, and the conntrack entry will be left there af it is was in Established state
> for 5 days. (this is also illustrated in Sriram's example below, just easier to
> read)
> 
> Item I marked with [*] above is vague. You need to consider what it would be
> loosing by such simplification and argument that it is okay.
> We can't just cut the fat out and pray for the best later on, right?
> But with this description here, it not only provides a solid justification on why
> it (specificly the new solution) is needed but also why it is currently already
> broken/not reliable in multi-homing setups. Yay! :-)

Agreed, I will update the change log with this description.

> 
> Then, next question is: would it make sense to keep the current processing for
> single path associations? That's a lot of code/maintenance and I fail to see a
> reasoning that justifies it.

I tried to simplify as much as possible, but of course, we can keep some states.
For e.g., keeping SHUTDOWN_* states can reduce the timeout of an ESTABLISHED connection even further. Of course, we will see that only on one of the paths.

> 
> >
> > Let's say there are two SCTP endpoints A and B with addresses A' and B, B''
> correspondingly.
> > Primary path is A' <----> B' that traverses middlebox C, and secondary path is
> A' <----> B'' that traverses middlebox D.
> > 1) SHUTDOWN sent on secondary path
> > 1a) SCTP endpoint A sets up an association towards SCTP endpoint B
> > 1b) Middlebox C sees INIT sequence and creates "primary" conntrack
> > entry (5 days)
> > 1c) Middlebox D sees HEARTBEAT sequence and creates "secondary"
> > conntrack entry (210 seconds)
> > 1d) Path failure between A and C, and SCTP endpoint A switches to
> > secondary path and continues sending data on the association
> > 1e) SCTP endpoint A decides to SHUTDOWN the connection
> > 1f) Middlebox C is in ESTABLISHED state, doesn't see any SHUTDOWN
> > sequence or HEARTBEAT, waits for timeout (5 days)
> > 1g) Middlebox D is in HEARTBEAT_ACKED state, doesn't care about
> > SHUTDOWN sequence, waits for timeout (210 seconds)
> >
> > 2) Recently fixed by bff3d0534804 ("netfilter: conntrack: add sctp
> > DATA_SENT state ")
> > 2a) SCTP endpoint A sets up an association towards SCTP endpoint B
> > 2b) Middlebox C sees INIT sequence and creates "primary" conntrack
> > entry (5 days)
> > 2c) Middlebox D sees DATA/SACK, and DROPS packets until HEARTBEAT is
> > seen to setup "secondary" conntrack entry (210 seconds)
> >
> > >
> > > Thanks.





[Index of Archives]     [Netfitler Users]     [Berkeley Packet Filter]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux