Thank you for taking the time to produce this extremely thorough review.
Please see [BB] inline.
Diffs are shown inline below, with deletions marked [-like this-] and insertions marked {+like this+}.
Alternatively, I've temporarily uploaded a side-by-side diff here:
https://bobbriscoe.net/tmp/draft-ietf-tsvwg-ecn-l4s-id-28a-DIFF-27.html
On 30/07/2022 00:51, Bernard Aboba via Datatracker wrote:
Reviewer: Bernard Aboba
Review result: On the Right Track

Here are my review comments. I believe this is quite an important document, so making the reasoning as clear as possible is important. Unfortunately, the writing and overall organization make the document hard to follow. If the authors are open to it, I'd be willing to invest more time to help get it into shape.
[BB] Thank you. You have already obviously sunk considerable time into it. Often I've found that your proposed alternative text didn't quite mean what we intended. But I've taken this as a sign that we hadn't explained it well and tried to guess what made you stumble.
This draft is in the long tail of many statistics: number of years since first draft, number of revisions, number of pages, etc. etc.
So I hope you will understand that this document has been knocked into all sorts of different shapes already, during a huge amount of WG review and consensus building, which I have tried not to upset, while also trying to understand why you felt it needed further changes.
Overall Comments

Abstract

Since this is an Experimental document, I was expecting the Abstract and perhaps the Introduction to refer briefly to the considerations covered in Section 7 (such as potential experiments and open issues).
[BB] Good point - I'm surprised no-one has brought this up before - thanks. I'll add the following:
Abstract:
...to prevent it degrading the low queuing delay and low loss of L4S traffic. This experimental track specification defines the rules that L4S transports and network elements need to follow with the intention that L4S flows neither harm each other's performance nor that of Classic traffic. It also suggests open questions to be investigated during experimentation. Examples of new ...
Intro:
There wasn't really a relevant point at which to mention the Experiments section (§7) until the document roadmap (which you ask for later).
So we added a brief summary of the "L4S Experiments" there (see later for the actual text). The only change to the Intro was the first line:
This experimental track specification...
Organization and inter-relation between Sections

The document has organizational issues which make it more difficult to read. I think that Section 1 should provide an overview of the specification, helping the reader navigate it.
[BB] Section 3 already provides the basis of a roadmap to both this and other documents. It points to §4 (Transports) & §5 (Network nodes).
It ought to have also referred to §6 (Tunnels and Encapsulations), which was added to the draft fairly recently (but without updating this roadmap). We can and should add that.
We could even move §3 to be the last subsection of §1 (i.e. §1.4). Then it could start the roadmap with §2, which gives the requirements for L4S packet identification.
However, a number of other documents already refer to the Prague L4S Requirements in §4, particularly §4.3. I mean not just I-Ds (which can still be changed), but also papers that have already been published. So a pragmatic compromise would be to just switch round sections 2 (requirements) & 3 (roadmap).
Then we could retitle §3 to "L4S Packet Identification: Document Roadmap"
and add brief mentions of the tail sections (§7 L4S Experiments, and the usual IANA and Security Considerations).
The result is below, quoted in full rather than as a marked-up diff (given we'd moved the whole section as well, a precise diff wouldn't have been meaningful anyway).
2. L4S Packet Identification: Document Roadmap

The L4S treatment is an experimental track alternative packet marking treatment to the Classic ECN treatment in [RFC3168], which has been updated by [RFC8311] to allow experiments such as the one defined in the present specification. [RFC4774] discusses some of the issues and evaluation criteria when defining alternative ECN semantics, which are further discussed in Section 4.3.1.

The L4S architecture [I-D.ietf-tsvwg-l4s-arch] describes the three main components of L4S: the sending host behaviour, the marking behaviour in the network and the L4S ECN protocol that identifies L4S packets as they flow between the two.

The next section of the present document (Section 3) records the requirements that informed the choice of L4S identifier. Then subsequent sections specify the L4S ECN protocol, which i) identifies packets that have been sent from hosts that are expected to comply with a broad type of sending behaviour; and ii) identifies the marking treatment that network nodes are expected to apply to L4S packets.

For a packet to receive L4S treatment as it is forwarded, the sender sets the ECN field in the IP header to the ECT(1) codepoint. See Section 4 for full transport layer behaviour requirements, including feedback and congestion response.

A network node that implements the L4S service always classifies arriving ECT(1) packets for L4S treatment and by default classifies CE packets for L4S treatment unless the heuristics described in Section 5.3 are employed. See Section 5 for full network element behaviour requirements, including classification, ECN-marking and interaction of the L4S identifier with other identifiers and per-hop behaviours.

L4S ECN works with ECN tunnelling and encapsulation behaviour as is, except there is one known case where careful attention to configuration is required, which is detailed in Section 6.

L4S ECN is currently on the experimental track. So Section 7 collects together the general questions and issues that remain open for investigation during L4S experimentation. Open issues or questions specific to particular components are called out in the specifications of each component part, such as the DualQ [I-D.ietf-tsvwg-aqm-dualq-coupled].

The IANA assignment of the L4S identifier is specified in Section 8. And Section 9 covers security considerations specific to the L4S identifier. System security aspects, such as policing and privacy, are covered in the L4S architecture [I-D.ietf-tsvwg-l4s-arch].
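(In case it helps to visualize the default classification step described above, here's a rough Python sketch of mine - it is purely illustrative, not text from the draft, and the function and parameter names are made up:

    # Illustrative only: default classification of arriving packets
    # at a network node implementing the L4S service (a DualQ or FQ).
    NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11   # IP-ECN field values (RFC 3168)

    def classify(ecn, ce_heuristics_enabled=False, looks_classic=False):
        """Return which treatment ('L4S' or 'Classic') a packet gets."""
        if ecn == ECT_1:
            return 'L4S'            # ECT(1) always gets L4S treatment
        if ecn == CE:
            # By default CE also goes to the L4S queue, unless the optional
            # heuristics of Section 5.3 judge it to be a Classic packet
            # that was already marked CE earlier on the path.
            if ce_heuristics_enabled and looks_classic:
                return 'Classic'
            return 'L4S'
        return 'Classic'            # Not-ECT and ECT(0)

The normative requirements are of course those in Sections 4 & 5, not this sketch.)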
Section 1.1 refers to definitions in Section 1.2, so I'd suggest that Section 1.2 come first.
[BB] The Problem Statement was made the first subsection because that's what motivates people to read on.
Your suggestion has been made by others in the past, and the solution was to informally explain new terms in the sections before the formal terminology section, as they arose.
The formal terminology section can be considered as the end of the Introductory material and the start of the formal body of the spec.
If there are phrases that are not clearly explained before the terminology section, please do point them out.
We can reconsider moving the terminology section to 1.1 if there are a lot.
But we'd rather the reader could continue straight into the summary of the problem and that it is understandable stand-alone - without relying on formal definitions elsewhere.
Section 1.3 provides basic information on Scope and the relationship of this document to other documents. I was therefore expecting Section 7 to include questions on some of the related documents (e.g. how L4S might be tested along with RTP).
[BB] That isn't the role of this document, which would be too abstract (or too long) if it had to cover how to test each different type of congestion control and each type of AQM.
Quoting from §7:
The specification of each scalable congestion control will need to include protocol-specific requirements for configuration and monitoring performance during experiments. Appendix A of the guidelines in [RFC5706] provides a helpful checklist.
Over the last 3 months, everyone involved in interop testing has been defining all the test plans, which had their first test-drive last week. Indeed, the success of the planning and organization of the tests surprised us all - kudos to Greg White, who was largely responsible for coordinating it.
We may end up writing that all up as a separate draft. If many tests were documented centrally like this, each CC or AQM might only need to identify any special-case tests specific to itself.
That might even cover testing with live traffic over the Internet as well. But let's walk before we run.
I wonder whether much of Section 2 could be combined with Appendix B, with the remainder moved into the Introduction, which might also refer to Appendix B.
[BB] What is the problem that you are trying to solve by breaking up this section?
If we split up this section, someone else will want parts moved back, or something else moved. Unless there's a major problem with this section, we'd rather it stayed in one piece. Its main purpose is to record the requirements and to say (paraphrasing), "The outcome is a compromise between requirements 'cos header space is limited. Other solutions were considered, but this one was the least worst."
Summary: no action here yet, pending motivating reasoning from your side.
Section 4.2

RTP over UDP: A prerequisite for scalable congestion control is for both (all) ends of one media-level hop to signal ECN support [RFC6679] and use the new generic RTCP feedback format of [RFC8888]. The presence of ECT(1) implies that both (all) ends of that media-level hop support ECN. However, the converse does not apply. So each end of a media-level hop can independently choose not to use a scalable congestion control, even if both ends support ECN.

[BA] The document earlier refers to an L4S modified version of SCreAM, but does not provide a reference. Since RFC 8888 is not deployed today, this paragraph (and Section 7) leaves me somewhat unclear on the plan to evaluate L4S impact on RTP. Or is the focus on experimentation with RTP over QUIC (e.g. draft-ietf-avtcore-rtp-over-quic)?
[BB] Ingemar has given this reply:
[IJ] RFC8298 (SCReAM) in its current version does not describe support for L4S. The open source running code on github does however support L4S. An update of RFC8298 has lagged behind but I hope to start with an RFC8298-bis after the vacation.
RFC8888 is implemented in the publicly available code for SCReAM (https://github.com/EricssonResearch/scream). This code has been extensively used in demos of 5G Radio Access Networks with L4S capability. The example demos have been cloud gaming and video streaming for remote-controlled cars.
The code includes gstreamer plugins as well as multi-camera code tailored for NVidia Jetson Nano/Xavier NX (that can be easily modified for other platforms).
[BB] As an interim reference, Ingemar's README is already cited as [SCReAM-L4S]. It is a brief but decent document about the L4S variant of SCReAM, which also gives further references (and the open source code is its own spec).
Summary: The RFC 8888 part of this question seems to be about plans for how the software for another RFC is expected to be installed or bundled.
Is this a question that you want this draft to answer?
For instance, for DCTCP [RFC8257], TCP Prague [I-D.briscoe-iccrg-prague-congestion-control], [PragueLinux] and the L4S variant of SCReAM [RFC8298], the average recovery time is always half a round trip (or half a reference round trip), whatever the flow rate.

[BA] I'm not sure that an L4S variant of SCReAM could really be considered "scalable" where simulcast or scalable video coding was being sent. In these scenarios, adding a layer causes a multiplicative increase in bandwidth, so that "probing" (e.g. stuffing the channel with RTX probes or FEC) is often a necessary precursor to make it possible to determine whether adding layers is actually feasible.
[BB] Ingemar has given this reply:
[IJ] The experiments run so far with SCReAM have been with the NVENC encoder, which supports rate changes on a frame-by-frame basis, and Jetson Nano/Xavier NX/Xavier AGX, which is a bit slower in its rate control loop. So the actual probing is done by adjusting the target bitrate of the video encoder.
[BB] Since last week (in the first L4S interop), we now have 2 other implementations of real-time video with L4S support directly over UDP (from NVIDIA and Nokia); in addition to the original 2015 demo (also from Nokia). You'd have to ask Ermin Sakic <esakic@xxxxxxxxxx> about the NVIDIA coding, and similarly Koen De Schepper <koen.de_schepper@xxxxxxxxx> about the Nokia ones. I do know that both Nokia ones change rate packet-by-packet (and if channel conditions are poor, the new one can even reduce down to 500kb/s while still preserving the same low latency).
The message here is that, for low latency video, you can't just use any old encoding that was designed without latency in mind.
Again, is this a question that you want this draft to answer? It seems like something that would be discussed in the spec of each r-t CC technique.
As with all transport behaviours, a detailed specification (probably an experimental RFC) is expected for each congestion control, following the guidelines for specifying new congestion control algorithms in [RFC5033]. In addition it is expected to document these L4S-specific matters, specifically the timescale over which the proportionality is averaged, and control of burstiness. The recovery time requirement above is worded as a 'SHOULD' rather than a 'MUST' to allow reasonable flexibility for such implementations.

[BA] Is the L4S variant of SCReaM one of the detailed specifications that is going to be needed? From the text I wasn't sure whether this was documented work-in-progress or a future work item.
[BB] We cannot force implementers to write open specifications of their algorithms. Implementers might have secrecy constraints, or just not choose to invest the time in spec writing. So there is no hit-list of specs that 'MUST' be written, except we consider it proper to document the reference implementation of the Prague CC.
Nonetheless, others also consider it proper to document their algorithm (e.g. BBRv2), and in the case of SCReAM, Ingemar has promised he will (as quoted above).
We don't (yet?) have a description of the latest two implementations that the draft can refer to (they only announced these on the first day of the interop last week).
We try to keep a living web page up to date that points to current implementations ( https://l4s.net/#code ). However, I don't think the RFC Editor would accept this as an archival reference.
Section 4.3.1

To summarize, the coexistence problem is confined to cases of imperfect flow isolation in an FQ, or in potential cases where a Classic ECN AQM has been deployed in a shared queue (see the L4S operational guidance [I-D.ietf-tsvwg-l4sops] for further details including recent surveys attempting to quantify prevalence). Further, if one of these cases does occur, the coexistence problem does not arise unless sources of Classic and L4S flows are simultaneously sharing the same bottleneck queue (e.g. different applications in the same household) and flows of each type have to be large enough to coincide for long enough for any throughput imbalance to have developed.

[BA] This seems to me to be one of the key questions that could limit the "incremental deployment benefit". A reference to the discussion in Section 7 might be appropriate here.
[BB] OK. At the end of the above para I've added:
Therefore, how often the coexistence
problem arises in practice is listed in Section 7 as an open
question that L4S experiments will need to answer.
5.4.1.1.1. 'Safe' Unresponsive Traffic

The above section requires unresponsive traffic to be 'safe' to mix with L4S traffic. Ideally this means that the sender never sends any sequence of packets at a rate that exceeds the available capacity of the bottleneck link. However, typically an unresponsive transport does not even know the bottleneck capacity of the path, let alone its available capacity. Nonetheless, an application can be considered safe enough if it paces packets out (not necessarily completely regularly) such that its maximum instantaneous rate from packet to packet stays well below a typical broadband access rate.

[BA] The problem with video traffic is that the encoder typically targets an "average bitrate" resulting in a keyframe with a bitrate that is above the bottleneck bandwidth and delta frames that are below it. Since the "average rate" may not be resettable before sending another keyframe, video has limited ability to respond to congestion other than perhaps by dropping simulcast and SVC layers. Does this mean that a video is "Unsafe Unresponsive Traffic"?
[BB] This section on 'Safe' Unresponsive traffic is about traffic that is so low rate that it doesn't need to use ECN to respond to congestion at all (e.g. DNS, NTP). Video definitely does not fall into that category.
I think your question is really asking whether video even /with/ ECN support can be considered responsive enough to maintain low latency. For this you ought to try to see the demonstration that Nokia did last week (if a recording is put online) or the Ericsson demonstration which is already online [EDT-5GLL]. Both ran over emulated 5G radio access networks with varying channel conditions, and both showed very fast interaction within the video with no perceivable lag to the human eye. With the Nokia one last week, using finger gestures sent over the radio network, you could control the viewport into a video from a 360° camera, which was calculated and generated at the remote end. No matter how fast you shook your finger around, the viewport stayed locked onto it.
Regarding keyframes, for low latency video, these are generally spread across the packets carrying the other frames.
[EDT-5GLL] Ericsson and DT demo 5G low latency feature: https://www.ericsson.com/en/news/2021/10/dt-and-ericsson-successfully-test-new-5g-low-latency-feature-for-time-critical-applications
I detect here that this also isn't a question about the draft - more a question of "I need to see it to believe it"?
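(Coming back to the 'Safe' Unresponsive definition itself, in case it helps: the criterion in the quoted section is only about pacing, e.g. something as trivial as the sketch below - purely my own illustration, and the 1 Mb/s cap is a made-up number, not from the draft:

    import time

    # 'Safe' unresponsive sender: pace packets so the instantaneous rate
    # stays well below a typical broadband access rate (assumed cap here).
    RATE_CAP = 1e6 / 8                       # bytes per second (~1 Mb/s)

    def paced_send(sock, addr, packets):
        for pkt in packets:
            sock.sendto(pkt, addr)           # e.g. a UDP socket
            time.sleep(len(pkt) / RATE_CAP)  # spread packets out in time

A congestion-controlled video sender clearly isn't in that category; it belongs in the 'responsive' one.)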
NITs

Abstract

The L4S identifier defined in this document distinguishes L4S from 'Classic' (e.g. TCP-Reno-friendly) traffic. It gives an incremental migration path so that suitably modified network bottlenecks can distinguish and isolate existing traffic that still follows the Classic behaviour, to prevent it degrading the low queuing delay and low loss of L4S traffic. This specification defines the rules that

[BA] Might be clearer to say "This allows suitably modified network..."
[BB] I'm not sure what the problem is. But I'm assuming you're saying you tripped over the word 'gives'. How about simplifying:
[-It gives an incremental migration path so that suitably modified-]{+Then,+} network bottlenecks can {+be incrementally modified to+} distinguish and isolate existing traffic that still follows the Classic behaviour, to prevent it degrading the low queuing delay and low loss of L4S traffic.
The words "incremental migration path" suggest that there deployment of L4S-capable network devices and endpoints provides incremental benefit. That is, once new network devices are put in place (e.g. by replacing a last-mile router), devices that are upgraded to support L4S will see benefits, even if other legacy devices are not ugpraded. If this is the point you are looking to make, you might want to clarify the language.
[BB] I hope the above diff helps. Is that enough for an abstract, which has to be kept very brief?
Especially as all the discussion about incremental deployment is in the L4S architecture doc, so it wouldn't be appropriate to make deployment a big thing in the abstract of this draft.
Nonetheless, we can flesh out the text where incremental deployment is already mentioned in the intro (see our suggested text for your later point about this, below).
Summary: We propose only the above diff on these points about "incremental migration" in the abstract.
L4S transports and network elements need to follow with the intention that L4S flows neither harm each other's performance nor that of Classic traffic. Examples of new active queue management (AQM) marking algorithms and examples of new transports (whether TCP-like or real-time) are specified separately.

[BA] Don't understand "need to follow with the intention". Is this stating a design principle, or does it represent deployment guidance?
[BB] I think a missing comma is the culprit. Sorry for confusion. It should be:
This specification defines the rules that L4S transports and network elements need to follow, with the intention that L4S flows neither harm each other's performance nor that of Classic traffic.
The sentence "L4S flows neither harm each other's performance nor that of classic traffic" might be better placed after the first sentence in the second paragraph, since it relates in part to the "incremental deployment benefit" argument.
[BB] That wouldn't be appropriate, because:
* To prevent "Classic harms L4S" an L4S AQM needs the L4S identifier on packets to isolate them
* To prevent "L4S harms Classic" needs the L4S sender to detect that it's causing harm which is sender behaviour (rules), not identifier-based.
So the sentence has to come after the point about "the spec defines the rules".
Summary: we propose no action on this point.
Section 1. Introduction

This specification defines the protocol to be used for a new network service called low latency, low loss and scalable throughput (L4S). L4S uses an Explicit Congestion Notification (ECN) scheme at the IP layer with the same set of codepoint transitions as the original (or 'Classic') Explicit Congestion Notification (ECN [RFC3168]). RFC 3168 required an ECN mark to be equivalent to a drop, both when applied in the network and when responded to by a transport. Unlike Classic ECN marking, the network applies L4S marking more immediately and more aggressively than drop, and the transport response to each

[BA] Not sure what "aggressively" means here. In general, marking traffic seems like a less aggressive action than dropping it. Do you mean "more frequently"?
[BB] OK; 'frequently' it is.
FWIW, I recall that the transport response used to be described as more aggressive (because it reduces less in response to each mark), and the idea was that using aggressive for both would segue nicely into the next sentence about the two counterbalancing. Someone asked for that to be changed, and now the last vestiges of that failed literary device are cast onto the cutting room floor. The moral of this tale: never try to write a literary masterpiece by committee ;)
Also, it's a bit of a run-on sentence, so I'd break it up: "than drop. The transport response to each" mark is reduced and smoothed relative to that for drop. The two changes counterbalance each other so that the throughput of an L4S flow will be roughly the same as a comparable non-L4S flow under the same conditions.
[BB] Not sure about this - by the next sentence (about the two changes), the reader has lost track of them. How about using numbering to structure the long sentence:
Unlike Classic ECN marking: i) the network applies L4S marking more immediately and more aggressively than drop; and ii) the transport response to each mark is reduced and smoothed relative to that for drop. The two changes counterbalance each other...OK?
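To make the counterbalance concrete, here's a toy sketch (my own, not text from the draft or either spec) contrasting the two responses - a Classic Reno-style halving per congestion signal [RFC5681] against a DCTCP-style response [RFC8257], which reduces in proportion to a smoothed marking fraction:

    # Toy illustration of (ii): a reduced, smoothed response to each mark.

    def reno_response(cwnd):
        # Classic: halve the window on each (infrequent) loss or mark.
        return cwnd / 2

    class ScalableResponse:
        # DCTCP-style: keep an EWMA 'alpha' of the fraction of packets
        # ECN-marked per RTT, then cut the window by only alpha/2.
        def __init__(self, gain=1/16):
            self.alpha, self.gain = 1.0, gain

        def per_rtt(self, marked, acked, cwnd):
            frac = marked / max(acked, 1)
            self.alpha += self.gain * (frac - self.alpha)
            return cwnd * (1 - self.alpha / 2)

Many small reductions in response to frequent marks end up cutting roughly as much, on average, as one Classic halving in response to an occasional mark - hence roughly the same throughput under the same conditions.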
Nonetheless, the much more frequent ECN control signals and the finer responses to these signals result in very low queuing delay without compromising link utilization, and this low delay can be maintained during high load. For instance, queuing delay under heavy and highly varying load with the example DCTCP/DualQ solution cited below on a DSL or Ethernet link is sub-millisecond on average and roughly 1 to 2 milliseconds at the 99th percentile without losing link utilization [DualPI2Linux], [DCttH19].

[BA] I'd delete "cited below" since you provide the citation at the end of the sentence.
[BB] 'Cited below' referred to the DCTCP and DualQ citations in the subsequent para, because this is the first time either term has been mentioned.
'Described below'
was what was really meant. I think that makes it clear enough (?).
Note that the inherent queuing delay while waiting to acquire a discontinuous medium such as WiFi has to be minimized in its own right, so it would be additional to the above (see section 6.3 of the L4S architecture [I-D.ietf-tsvwg-l4s-arch]).

[BA] Not sure what "discontinuous medium" means. Do you mean wireless? Also "WiFi" is a colloquialism; the actual standard is IEEE 802.11 (WiFi Alliance is an industry organization). Might reword this as follows: "Note that the changes proposed here do not lessen delays from accessing the medium (such as is experienced in [IEEE-802.11]). For discussion, see Section 6.3 of the L4S architecture [I-D.ietf-tsvwg-l4s-arch]."
[BB] We've used 'shared' instead. Other examples of shared media are LTE, 5G, DOCSIS (cable), DVB (satellite), PON (passive optical network). So I've just said 'wireless' rather than give a gratuitous citation of 802.11.
Note that the [-inherent-] queuing delay while waiting to acquire a [-discontinuous-]{+shared+} medium such as [-WiFi-]{+wireless+} has to be [-minimized in its own right, so it would be additional-]{+added+} to the above. {+It is a different issue that needs to be addressed, but separately+} (see section 6.3 of the L4S architecture [I-D.ietf-tsvwg-l4s-arch]).
Then, because 'wireless' is less specific, I've taken out 'inherent': strictly, medium acquisition delay is not inherent to a medium - it depends on the multiplexing scheme. For instance, radio networks can use CDM (code division multiplexing), and they did in 3G.
'Inherent' was trying to get over the sense that this delay is not amenable to reduction by congestion control. Rather than try to cram all those concepts into one sentence, I've split it.
OK?
L4S is not only for elastic (TCP-like) traffic - there are scalable congestion controls for real-time media, such as the L4S variant of the SCReAM [RFC8298] real-time media congestion avoidance technique (RMCAT). The factor that distinguishes L4S from Classic traffic is [BA] Is there a document that defines the L4S variant of SCReAM?
[BB] I've retagged Ingemar's readme as [SCReAM-L4S], and included it here to match the other two occurrences of SCReAM:
such as the L4S variant [SCReAM-L4S] of the SCReAM [RFC8298] real-time media congestion avoidance technique (RMCAT).
It sounds like Ingemar plans to update RFC8298 with a bis, so I guess eventually [RFC8298] should automatically become a reference to its own update.
its behaviour in response to congestion. The transport wire protocol, e.g. TCP, QUIC, SCTP, DCCP, RTP/RTCP, is orthogonal (and therefore not suitable for distinguishing L4S from Classic packets). The L4S identifier defined in this document is the key piece that distinguishes L4S from 'Classic' (e.g. Reno-friendly) traffic. It gives an incremental migration path so that suitably modified network bottlenecks can distinguish and isolate existing Classic traffic from L4S traffic to prevent the former from degrading the very low delay and loss of the new scalable transports, without harming Classic performance at these bottlenecks. Initial implementation of the separate parts of the system has been motivated by the performance benefits.

[BA] I think you are making an "incremental benefit" argument here, but it might be made more explicit: "The L4S identifier defined in this document distinguishes L4S from 'Classic' (e.g. Reno-friendly) traffic. This allows suitably modified network bottlenecks to distinguish and isolate existing Classic traffic from L4S traffic, preventing the former from degrading the very low delay and loss of the new scalable transports, without harming Classic performance. As a result, deployment of L4S in network bottlenecks provides incremental benefits to endpoints whose transports support L4S."
[BB] We don't really want to lose the point about the identifier being key. So I've kept that. And for the middle sentence, I've used the simpler construction developed above (for the similar wording in the abstract).
Regarding the last sentence, no, it meant more than that. It meant that, even though implementers' customers get no benefit until both parts are deployed, for some implementers the 'size of the potential prize' has already been great enough to warrant investment in implementing their part, without any guarantee that other parts will be implemented. However, we need to be careful not to stray into conjecture and predictions, particularly not commercial ones, which is why this sentence was written in the past tense. Pulling this all together, how about:
The L4S identifier defined in this document is the key piece that distinguishes L4S from 'Classic' (e.g. Reno-friendly) traffic. [-It gives an incremental migration path so that suitably modified-]{+Then,+} network bottlenecks can {+be incrementally modified to+} distinguish and isolate existing Classic traffic from L4S traffic, to prevent the former from degrading the very low queuing delay and loss of the new scalable transports, without harming Classic performance at these bottlenecks. Although both sender and network deployment are required before any benefit, initial implementations of the separate parts of the system have been motivated by the potential performance benefits.

(I considered adding "have already been motivated..." or "at the time of writing, initial implementations..." but decided against both - they sounded a bit hyped up.)
What do you think?
Section 1.1

1.1. Latency, Loss and Scaling Problems

Latency is becoming the critical performance factor for many (most?) applications on the public Internet, e.g. interactive Web, Web services, voice, conversational video, interactive video, interactive remote presence, instant messaging, online gaming, remote desktop, cloud-based applications, and video-assisted remote control of machinery and industrial processes. In the 'developed' world, further increases in access network bit-rate offer diminishing returns, whereas latency is still a multi-faceted problem. In the last decade or so, much has been done to reduce propagation time by placing caches or servers closer to users. However, queuing remains a major intermittent component of latency.

[BA] Since this paragraph provides context for the work, you might consider placing it earlier (in Section 1 as well as potentially in the Abstract).
[BB] The L4S architecture Intro already starts like you suggest.
See https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-l4s-arch-19#section-1
The present doc starts out more as a technical spec might, with a 4-para intro focusing on what it says technically. Then it has a fairly long subsection to summarize the problem for those reading it stand-alone. That is intentional (so readers who have already read the architecture can easily jump).
Summary: We propose to leave the opening of the intro unchanged.
[BA] Might modify this as follows: "Latency is the critical performance factor for many Internet applications, including web services, voice, realtime video, remote presence, instant messaging, online gaming, remote desktop, cloud services, and remote control of machinery and industrial processes. In these applications, increases in access network bitrate may offer diminishing returns. As a result, much has been done to reduce delays by placing caches or servers closer to users. However, queuing remains a major contributor to latency."

[BB] We've picked up most, but not all, of your suggestions:
Latency is becoming the critical performance factor for many [-(most?) applications on the public Internet,-]{+Internet applications,+} e.g. interactive [-Web, Web-]{+web, web+} services, voice, conversational video, interactive video, interactive remote presence, instant messaging, online gaming, remote desktop, cloud-based [-applications,-]{+applications & services,+} and [-video-assisted-] remote control of machinery and industrial processes. In {+many parts of+} the [-'developed'-] world, further increases in access network [-bit-rate-]{+bit rate+} offer diminishing returns {+[Dukkipati06]+}, whereas latency is still a multi-faceted problem. [-In the last decade or so,-]{+As a result,+} much has been done to reduce propagation time by placing caches or servers closer to users. However, queuing remains a [-major intermittent-]{+major, albeit intermittent,+} component of latency.
We've added [Dukkipati06], because we were asked to justify the similar 'diminishing returns' claim in the L4S architecture, and Dukkipati06 provides a plot supporting that in its intro:
[Dukkipati06] Dukkipati, N. and N. McKeown, "Why Flow-Completion Time is the Right Metric for Congestion Control", ACM CCR 36(1):59--62, January 2006, <https://dl.acm.org/doi/10.1145/1111322.1111336>.
The distinctions between different applications of the same technology were deliberately intended to distinguish different degrees of latency sensitivity, so we left some of them in.
OK?
The Diffserv architecture provides Expedited Forwarding [RFC3246], so that low latency traffic can jump the queue of other traffic. If growth in high-throughput latency-sensitive applications continues, periods with solely latency-sensitive traffic will become increasingly common on links where traffic aggregation is low. For instance, on the access links dedicated to individual sites (homes, small enterprises or mobile devices). These links also tend to become the path bottleneck under load. During these periods, if all the traffic were marked for the same treatment, at these bottlenecks Diffserv would make no difference. Instead, it becomes imperative to remove the underlying causes of any unnecessary delay.

[BA] This paragraph is hard to follow. You might consider rewriting it as follows: "The Diffserv architecture provides Expedited Forwarding [RFC3246], to enable low latency traffic to jump the queue of other traffic. However, the latency-sensitive applications are growing in number along with the fraction of latency-sensitive traffic. On bottleneck links where traffic aggregation is low (such as links to homes, small enterprises or mobile devices), if all traffic is marked for the same treatment, Diffserv will not make a difference. Instead, it is necessary to remove unnecessary delay."
[BB] Your proposed replacement has the following problems:
* It relies on prediction (the previous text avoided prediction, instead saying "if growth ... continues");
* The proposed replacement loses the critical sense of "periods with solely latency sensitive traffic" (not all the time)
* It also loses the critical idea that the same links that are low stat-mux tend to also be those where the bottleneck is.
How about:
The Diffserv architecture provides Expedited Forwarding [RFC3246], so that low latency traffic can jump the queue of other traffic. If growth in [-high-throughput-] latency-sensitive applications continues, periods with solely latency-sensitive traffic will become increasingly common on links where traffic aggregation is low. [-For instance, on the access links dedicated to individual sites (homes, small enterprises or mobile devices). These links also tend to become the path bottleneck under load.-] During these periods, if all the traffic were marked for the same treatment, [-at these bottlenecks-] Diffserv would make no difference. [-Instead,-]{+The links with low aggregation also tend to become the path bottleneck under load, for instance, the access links dedicated to individual sites (homes, small enterprises or mobile devices). So, instead of differentiation,+} it becomes imperative to remove the underlying causes of any unnecessary delay.
I tried to guess what you found hard to follow, but still to keep all the concepts. The main changes were:
* to switch the sentence order so "periods with solely" and "these periods" were not a few sentences apart.
* to make it clear what 'instead' meant.
Better?
long enough for the queue to fill the buffer, making every packet in other flows sharing the buffer sit through the queue. [BA] "sit through" -> "share"
[BB] Nah, that's a tautology: "other flows sharing the buffer share the queue".
And it loses the sense of waiting. If "sit through" isn't understandable, how about
"...causing every packet in other flows sharing the buffer to have to
work its way through the queue."
?
Active queue management (AQM) was originally developed to solve this problem (and others). Unlike Diffserv, which gives low latency to some traffic at the expense of others, AQM controls latency for _all_ traffic in a class. In general, AQM methods introduce an increasing level of discard from the buffer the longer the queue persists above a shallow threshold. This gives sufficient signals to capacity-seeking (aka. greedy) flows to keep the buffer empty for its intended purpose: absorbing bursts. However, RED [RFC2309] and other algorithms from the 1990s were sensitive to their configuration and hard to set correctly. So, this form of AQM was not widely deployed. More recent state-of-the-art AQM methods, e.g. FQ-CoDel [RFC8290], PIE [RFC8033], Adaptive RED [ARED01], are easier to configure, because they define the queuing threshold in time not bytes, so it is invariant for different link rates. However, no matter how good the AQM, the sawtoothing sending window of a Classic congestion control will either cause queuing delay to vary or cause the link to be underutilized. Even with a perfectly tuned AQM, the additional queuing delay will be of the same order as the underlying speed-of-light delay across the network, thereby roughly doubling the total round-trip time.

[BA] Would suggest rewriting as follows: "More recent state-of-the-art AQM methods such as FQ-CoDel [RFC8290], PIE [RFC8033] and Adaptive RED [ARED01], are easier to configure, because they define the queuing threshold in time not bytes, providing link rate invariance. However, AQM does not change the "sawtooth" sending behavior of Classic congestion control algorithms, which alternates between varying queuing delay and link underutilization. Even with a perfectly tuned AQM, the additional queuing delay will be of the same order as the underlying speed-of-light delay across the network, thereby roughly doubling the total round-trip time."
[BB] We've taken most of these suggestions, but link rate invariance is rather a mouthful.
Also, "more queue delay or more under-utilization" wasn't meant to imply alternating between the two.
So how about:
More recent state-of-the-art AQM methods, [-e.g.-]{+such as+} FQ-CoDel [RFC8290], PIE [RFC8033] or Adaptive RED [ARED01], are easier to configure, because they define the queuing threshold in time not bytes, so [-it-]{+configuration+} is invariant [-for different-]{+whatever the+} link [-rates.-]{+rate.+} However, [-no matter how good the AQM,-] the sawtoothing [-sending-] window of a Classic congestion control creates a dilemma for the operator: i) either configure a shallow AQM operating point, so the tips of the sawteeth cause minimal queue delay but the troughs underutilize the link, or ii) configure the operating point deeper into the buffer, so the troughs utilize the link better but then the tips cause more delay variation. Even...

OK?
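If it helps, here are some back-of-envelope numbers (my own, purely illustrative - the link rate and RTT are made-up examples) behind that dilemma and behind the 'roughly doubling the RTT' point in the following sentence:

    # One Reno-like flow saws its window between W/2 and W, where W has to
    # reach about one bandwidth-delay product (BDP) to fill the link.
    link_rate = 40e6 / 8        # e.g. a 40 Mb/s access link, in bytes/s
    base_rtt  = 20e-3           # e.g. 20 ms of speed-of-light RTT
    bdp = link_rate * base_rtt  # bytes in flight needed to fill the link

    # Operating point (i): a shallow AQM target keeps the sawtooth tips out
    # of the buffer, but the troughs can leave the link ~25% underutilized.
    # Operating point (ii): a target of ~1 BDP of standing queue keeps the
    # link full, but the tips then add roughly one extra base RTT of delay:
    extra_delay_s = bdp / link_rate     # == base_rtt, so the RTT ~doubles
    print(round(extra_delay_s * 1e3), "ms of extra queuing delay")   # ~20 ms

Obviously the exact numbers depend on the link and the AQM, but the order of magnitude is the point.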
If a sender's own behaviour is introducing queuing delay variation, no AQM in the network can 'un-vary' the delay without significantly compromising link utilization. Even flow-queuing (e.g. [RFC8290]), which isolates one flow from another, cannot isolate a flow from the delay variations it inflicts on itself. Therefore those applications that need to seek out high bandwidth but also need low latency will have to migrate to scalable congestion control. [BA] I'd suggest you delete the last sentence, since the point is elaborated on in more detail in the next paragraph.
[BB] Actually, this point is not made in the next para (but you might have thought it was because it's not clear, so below I've tried to fix it).
Indeed, I've realized we need to /add/ to the last sentence, because we haven't yet said what a scalable control is...
...migrate to scalable congestion control, which uses much smaller sawtooth variations.
Altering host behaviour is not enough on its own though. Even if hosts adopt low latency behaviour (scalable congestion controls), they need to be isolated from the behaviour of existing Classic congestion controls that induce large queue variations. L4S enables that migration by providing latency isolation in the network and distinguishing the two types of packets that need to be isolated: L4S and Classic. L4S isolation can be achieved with a queue per flow (e.g. [RFC8290]) but a DualQ [I-D.ietf-tsvwg-aqm-dualq-coupled] is sufficient, and actually gives better tail latency. Both approaches are addressed in this document.

[BA] "enables that migration" -> "motivates incremental deployment"
[BB] The intended meaning here is 'enables' (technical feasibility), not motivates (human inclination).
But whatever, in the rewording below, I don't think either is needed. I'm also assuming that middle sentence didn't make sense for you, and I think I see why. So how about:
Altering host behaviour is not enough on its own though. Even if hosts adopt low latency [-behaviour-] (scalable congestion controls), they need to be isolated from the [-behaviour of-] {+large queue variations induced by+} existing Classic congestion controls[-that induce large queue variations-]. [-L4S enables that-]{+L4S AQMs provide that+} latency isolation in the network and [-migration by providing-] [-distinguishing-]{+the L4S identifier enables the AQMs to distinguish+} the two types of packets [-that need to be isolated-]: L4S and Classic.
How's that?
The DualQ solution was developed to make very low latency available without requiring per-flow queues at every bottleneck. This was [BA] "This was" -> "This was needed"
[BB] Not quite that strong. More like:
"This was useful"
Latency is not the only concern addressed by L4S: It was known when [BA] ":" -> "."
[BB] OK.
explanation is summarised without the maths in Section 4 of the L4S [BA] "summarised without the maths" -> "summarized without the mathematics"
[BB] OK - that nicely side-steps stumbles from either side of the Atlantic.
1.2. Terminology

[BA] Since Section 1.1 refers to some of the Terminology defined in this section, I'd consider placing this section before that one.
[BB] See earlier for push-back on this.
Reno-friendly: The subset of Classic traffic that is friendly to the standard Reno congestion control defined for TCP in [RFC5681]. The TFRC spec. [RFC5348] indirectly implies that 'friendly' is [BA] "spec." -> "specification"
[BB] I checked this after a previous review comment, and 'spec' is now considered to be a word in its own right. I should have removed the full-stop though, which I did for all other occurrences.
However, the RFC Editor might have a style preference on this point, in which case I will acquiesce.
defined as "generally within a factor of two of the sending rate of a TCP flow under the same conditions". Reno-friendly is used here in place of 'TCP-friendly', given the latter has become imprecise, because the TCP protocol is now used with so many different congestion control behaviours, and Reno is used in non- [BA] "Reno is used" -> "Reno can be used"
[BB] OK
4. Transport Layer Behaviour (the 'Prague Requirements')

[BA] This section is empty and there are no previous references to Prague. So I think you need to say a few words here to introduce the section.
[BB] OK. How about:
This section defines L4S behaviour at the transport layer, also known
as the Prague L4S Requirements (see Appendix A for the origin of the
name).
Again, thank you very much for all the time and effort you've put into this review.
Regards
Bob
--
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/