Re: [Last-Call] [Int-dir] Intdir telechat review of draft-ietf-masque-connect-ip-10

David Schinazi <dschinazi.ietf@xxxxxxxxx> · Wed, 19 Apr 2023 14:30:14 -0700

Hi Joe,

I think we're all in agreement that TCP-in-TCP leads to poor performance. And the document normatively recommends against doing that:
<< When the protocol running inside the tunnel uses loss recovery (e.g., [TCP] or [QUIC]), and the outer HTTP connection runs over TCP, the proxied traffic will incur at least two nested loss recovery mechanisms. This can reduce performance as both can sometimes independently retransmit the same data. To avoid this, IP proxying SHOULD be performed over HTTP/3 to allow leveraging the QUIC DATAGRAM frame. >>

When it comes to what the document calls "nested congestion control", however, the results are different. While there is ample literature on TCP-in-TCP (which we call "nested loss recovery"), I don't think anyone has ever researched nested congestion control without nested loss recovery (or nested flow control). Thinking about this more, I think that the QUIC DATAGRAM frame is the first protocol to have nested congestion control without loss recovery or flow control. That would explain why there isn't any research on this yet.

From your previous messages I'm wondering if you might not know all the subtleties of how QUIC DATAGRAM frames work. Let me attempt to reply in detail here:

> The lower one slams the window down due to loss; the upper one should never really see loss at all (given it’s running over TCP),

The first part here ("The lower one slams the window down due to loss") is true for the QUIC DATAGRAM frame, but the latter is not. QUIC DATAGRAM frames get dropped, so the upper TCP will see loss.

> but every time a loss and retransmit occurs, the RTT measurements at the upper layer take a hit.

Similarly, when a QUIC DATAGRAM frame is lost, it is not retransmitted so there is no delay introduced, just a loss.

Additionally, QUIC DATAGRAM frames are not subject to flow control.

Putting this all together: when the lower QUIC connection's congestion controller kicks in, it will start dropping QUIC DATAGRAM frames, which the upper TCP will see as loss. This is very similar to how a router's queue or a radio link will cause loss. It doesn't have any of the introduced delays or retransmissions that you describe.
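
To make the contrast concrete, here is a minimal Python sketch of what the upper flow observes in each case. It is a toy model under stated assumptions, not taken from the draft or from any QUIC or TCP implementation: one packet is sent per time unit, the tunnel has a fixed one-way delay, the reliable tunnel delivers in order, and a retransmission always succeeds after a fixed timeout. The names `reliable_tunnel`, `datagram_tunnel`, `delay`, `rto` and the loss pattern are all hypothetical.

```python
from typing import List, Optional


def reliable_tunnel(lost: List[bool], delay: float = 1.0,
                    rto: float = 3.0) -> List[float]:
    """Latency seen by the inner flow over a lossless, in-order tunnel
    (the TCP-in-TCP case). Assumption: each lost packet is retransmitted
    once and arrives `rto` time units late."""
    latencies = []
    prev_delivery = 0.0
    for i, is_lost in enumerate(lost):
        send_time = float(i)  # one packet per time unit
        arrival = send_time + delay + (rto if is_lost else 0.0)
        # In-order delivery: later packets wait behind the retransmission.
        delivery = max(arrival, prev_delivery)
        prev_delivery = delivery
        latencies.append(delivery - send_time)
    return latencies


def datagram_tunnel(lost: List[bool],
                    delay: float = 1.0) -> List[Optional[float]]:
    """Latency seen by the inner flow when dropped packets are simply gone
    (the QUIC DATAGRAM case). None marks a loss visible to the inner flow."""
    return [None if is_lost else delay for is_lost in lost]


# Hypothetical loss pattern: the third packet is dropped by the lower layer.
lost = [False, False, True, False, False, False, False]
print(reliable_tunnel(lost))  # [1.0, 1.0, 4.0, 3.0, 2.0, 1.0, 1.0]
print(datagram_tunnel(lost))  # [1.0, 1.0, None, 1.0, 1.0, 1.0, 1.0]
```

In this model the reliable tunnel hides the loss but inflates the latency of the lost packet and of those queued behind it, which is the RTT hit described for TCP-in-TCP. The datagram tunnel shows a single gap and leaves the latency of every delivered packet unchanged.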

So if what you're asking is "SHOULD NOT do TCP-in-TCP", then we're in agreement and that's already in the doc. But "nested congestion control" means something else.

David

On Wed, Apr 19, 2023 at 1:58 PM Christian Huitema <huitema@xxxxxxxxxxx> wrote:

On 4/19/2023 11:34 AM, touch@xxxxxxxxxxxxxx wrote:

> Hi, Christian,
>
> For TCP on TCP:
>
> The bottom-most TCP turns an assumed fixed-capacity link with variable loss into a lossless link with variable RTT *and* it is trying to adapt to any changes in RTT and capacity.
>
> The TCP on top then ends up as a badly oscillating flow control mechanism.
>
> When the link transmission capacity varies; that’s a different change with presumably a relatively fixed RTT.

Many links do not have a fixed capacity from point A to point B. This is not new. TCP used to work over yellow cable Ethernet or X.25 virtual circuits, and in both cases the capacity of the "link" varied with the competing load on other Ethernet connections or other virtual circuits. Congestion control algorithms like Reno, Cubic or BBR are all designed to adapt to variable capacity.

The loss detection algorithms are also designed to deal with variable end-to-end delays. Some algorithms such as Ledbat or BBR do depend on an estimate of the min RTT, but that's not affected here.

> They’re not the same; TCP and other algs try to deal with one level of this, but none (AFAIK) are intended to be recursively (TCP on TCP) or mutually (TCP on QUIC or QUIC on TCP) stacked and end up stable.

Maybe. If the "link" TCP is a small part of the path, I would be surprised to see big issues, but if the two TCP connections are exactly superposed, there might be some issues. That probably depends a lot on deployment conditions, such as which congestion control algorithm is used at each layer. I would be interested to read the papers that describe the issue.

Concretely, this seems to be an issue not so much with the "link" itself as with the management of the input queue. As far as Masque is concerned, I think the proper way to address that is to document the issue, cite the papers that you refer to, and recommend that deployments place adequate queue management algorithms in front of the tunnel. Do we have an AQM best practice RFC that could be cited?

-- Christian Huitema
