The smaller issues that I have specific comments about are listed below.
Minor/editorial/nit issues:
Abstract
This document serves as a guide to implementation on how to
build such a system.
The phrases "implementation" and "how to build" seem redundant.
Perhaps "guide to implementing such a system"? (Similarly, sec. 1,
para. 2, sent. 1.)
1. Introduction
I would expect the text of the introduction to be clearer on the
normative status of this document. I would expect the implementation
document to have few normative statements beyond those copied from the
API definition. But it's possible to have details like the specifics of
candidate choice have SHOULD statements that aren't in the API
definition, as those behaviors aren't directly visible in the API
behavior and may be heavily affected by a particular implementation
environment.
2. Implementing Connection Objects
The properties held by a
Connection or Listener are independent of other connections that are
not part of the same Connection Group.
First, "connections" should be "Connections".
Indeed, it appears that the properties are independent of connections
in the same Connection Group, too, as properties are inherited from
the Preconnection and then fixed thereafter.
But I think my issue is what is meant by "properties". I suspect that
the API model has two sorts of properties: one sort is either
inherited from the Preconnection or set by the process of connecting
(e.g., the local ephemeral port), and the other sort does not come
from the Preconnection but is shared among Connections in a
Connection Group. If so, this distinction needs to be made explicit.
But it's still not clear to me how these connection properties can be
changed once a Connection is established, even by another Connection
in a Connection Group.
Once Initiate has been called, the Selection Properties and Endpoint
information are immutable (i.e, an application is not able to later
modify Selection Properties on the original Preconnection object).
Listener objects are created with a Preconnection, at which point
their configuration should be considered immutable by the
implementation.
I had some difficulty reading this paragraph. After thinking about
it, it seems that all of it is about listening, but that is not made
clear at the beginning, and I assumed it was continuing the discussion
of connection establishment in the previous paragraph.
Within that context, sec. 2 states that when a Listener is created,
all applicable properties are copied from the Preconnection, and no
later changes to the Preconnection change the Listener. But the first
sentence says that the Preconnection object cannot be changed, which
makes no sense.
3. Implementing Pre-Establishment
It appears that pre-establishment is the creating/updating of a
Preconnection object (and that Endpoints, Connection Properties, and
Capacity Profile are data within the Preconnection object), but this
should be stated directly and as early as possible in this section.
3.1. Configuration-time errors
3.2. Role of system policy
Lastly, the implementation itself may default to disallowing certain
network interfaces unless explicitly requested by the application and
allowed by the system.
I believe the point being made by this sentence would be clearer if
"and allowed by the system" was omitted. The meaning that phrase adds
is included in the meanings of the previous sentences. Omitting that
phrase emphasizes the meaning that policy may require the application
to specifically request certain resources in order to obtain them (as
opposed to them being available whenever they are not explicitly
forbidden) -- which is a pattern that is not all that common in APIs
and so should be stated explicitly.
In regard to normativity, you probably want "MAY". And I think it's
reasonable to have this statement only in the implementation document,
and not in the API definition, as the process of determining what
resources are allocated to a particular application request
"inherently" involves details of the implementation and environment
that an API definition cannot specify exactly.
An
implementation should attempt to look up the relevant policies for
the system in a dynamic way to make sure it is reflecting an accurate
version of the system policy, since the system's policy regarding the
application's traffic may change over time due to user or
administrative changes.
Two questions arise: (1) Once a Connection is created, its transport
properties likely cannot be altered. Does that mean that the last
moment at which policy can be applied to a Connection is when it is
created? (2) This statement conflicts with "To avoid allocating
resources that are not finally needed, it is important that
configuration-time errors fail as early as possible." because if
policy is looked up dynamically, it's possible to create a
Preconnection that conflicts with policy at the time it is created,
but the policy then changes, and after that a Connection can be
created using the Preconnection without conflicting with policy.
Obviously, there is no simple, definitive solution to these problems,
but this is probably a good place in the document to point out the
complexities and the API features that enable an application to deal
with them. E.g. I would expect that an application can request a
Preconnection that conflicts with the current policy, and by default,
that returns an error, so that simple/naive applications see the
problem and abort as early as possible. But if the application
explicitly ignores the error, it can obtain such a Preconnection and,
under the expectation that it will become compatible with policy,
attempt to use it later.
Similarly, when creating a Connection, there is a possible error "the
Preconnection conflicts with policy right now, even though it did not
when it was created". And there needs to be some sort of "policy has
changed and the Connection is now violating policy" error.
These error returns and how an application differentiates them from
other errors should be specified in the API definition.
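To make the three cases concrete, here is a minimal sketch of how they
could be told apart. The error names, the stage labels, and the
classify() helper are all hypothetical; nothing here is taken from the
API definition, which is where the real names would have to be defined.

```python
from enum import Enum


class PolicyError(Enum):
    # Hypothetical error kinds; the API definition would have to name the real ones.
    CONFLICT_AT_CONFIGURATION = "Preconnection conflicts with current policy"
    CONFLICT_AT_INITIATE = "Preconnection conflicts with policy now, but did not when created"
    VIOLATION_AFTER_ESTABLISHMENT = "policy changed; the established Connection now violates it"


def classify(stage, allowed_at_creation, allowed_now):
    """Return the policy error to report at a given stage, or None if policy allows it.

    stage is one of "preconnection", "initiate", "established" (illustrative labels).
    """
    if allowed_now:
        return None
    if stage == "preconnection":
        return PolicyError.CONFLICT_AT_CONFIGURATION
    if stage == "initiate":
        # Distinguish "policy changed since creation" from "never complied".
        if allowed_at_creation:
            return PolicyError.CONFLICT_AT_INITIATE
        return PolicyError.CONFLICT_AT_CONFIGURATION
    return PolicyError.VIOLATION_AFTER_ESTABLISHMENT
```

The point of the sketch is only that an application needs three
distinguishable outcomes, not that they should be modeled this way.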
4. Implementing Connection Establishment
For ease of illustration, this document structures the candidates for
racing as a tree (see Section 4.1). This is not meant to restrict
implementations from structuring racing candidates differently.
Any one of these sub-entries on the aggregate connection attempt
would satisfy the original application intent. The concern of this
section is the algorithm defining which of these options to try,
when, and in what order.
During Candidate Gathering (Section 4.2), an implementation prunes
and sorts branches according to the Selection Property preferences
(Section 6.2 of [I-D.ietf-taps-interface]. It first excludes all
protocols and paths that match a Prohibit property or do not match
all Require properties. Then it will sort branches according to
Preferred properties, Avoided properties, and possibly other
criteria.
This section needs to clarify which statements are normative and which
are not. Naively, I would expect that the implementation is free to
implement connection selection in whatever way it wants, and this
section is just a suggestion. But "Then it will sort branches
according to Preferred properties, Avoided properties ..." makes it
clear that the API definition requires the implementation to take into
account some properties. In that context, what are the limitations on
the allowed selection algorithms? Particularly, if an algorithm does
not organize candidates as a tree, what is the import of "sort
branches according to Preferred properties, Avoided properties"?
Also, using SHOULD and MUST would clarify things.
4.1. Structuring Candidates as a Tree
The parent (or trunk) node of the tree will be represented by
a single integer, such as "1".
My understanding is that the usual term is "root node".
As noted above, the consideration of multiple candidates in a
gathering and racing process can be conceptually structured as a
tree; this terminological convention is used throughout this
document.
In protocol stacks, the layers are
separated by '/' and ordered top-down.
Given that the tree structure has an up-down dimension and this
sentence is not referring to that dimension, it might be clearer to
say that the designations of protocol stack layers are "ordered
top-layer-first" since this document writes them that way
(e.g. "HTTP/TCP").
A connection establishment tree may be degenerate, and only have a
single leaf node, such as a connection attempt to an IP address over
a single interface with a single protocol.
Given that "degenerate" is used nowhere else, this could be simplified
to "A connection establishment tree may consist of only a single leaf
node, such as a connection attempt to a specified IP address over a
single interface with a single protocol."
4.1.2. Branching Order-of-Operations
For example, if the application has indicated both a preference for
WiFi over LTE and for a feature only available in SCTP, branches will
be first sorted accord to path selection, with WiFi at the top.
Then, branches with SCTP will be sorted to the top within their
subtree according to the properties influencing protocol selection.
I find the use of "top" confusing here. Given we are talking about
trees, I consider "top" to mean closer to the root of the tree. Here,
it appears to mean "earlier in the set of children of a parent node".
I think "first" would be better, as we are arranging the children
under a parent node in what is usually shown as left-to-right order
(or, more precisely, in the time order in which the children will be
tried).
Note that this trouble comes from the compact notation used for trees,
which causes the typographical top-bottom dimension to be used both
for the top-bottom axis of the layers of the tree and for the
sequencing of the children of a node. Make sure people aren't
confused if they continue to think about trees in the usual graphical
presentation.
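As an illustration of the reading I am suggesting, here is a minimal
sketch in which "first" means "earliest in a parent's ordered list of
children", independent of depth in the tree. The Node class, the
labels, and the helper function are my own assumptions for
illustration and are not taken from the document.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    # List order is the order in which the children will be tried.
    children: list = field(default_factory=list)


def leaves_in_attempt_order(node):
    """Depth-first walk: leaves come out in the order they would be attempted."""
    if not node.children:
        return [node.label]
    ordered = []
    for child in node.children:
        ordered.extend(leaves_in_attempt_order(child))
    return ordered


# WiFi is sorted first among the paths; SCTP is sorted first within each path.
root = Node("1", [
    Node("1.1 WiFi", [Node("1.1.1 SCTP over WiFi"), Node("1.1.2 TCP over WiFi")]),
    Node("1.2 LTE", [Node("1.2.1 SCTP over LTE"), Node("1.2.2 TCP over LTE")]),
])
attempt_order = leaves_in_attempt_order(root)
```

Here "sorted first" affects only the position within a list of
siblings; no node moves closer to the root.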
4.1.3. Sorting Branches
Implementations should sort the branches of the tree of connection
options in order of their preference rank, from most preferred to
least preferred. Leaf nodes on branches with higher rankings
represent connection attempts that will be raced first.
Implementations should order the branches to reflect the preferences
expressed by the application for its new connection, including
Selection Properties, which are specified in
[I-D.ietf-taps-interface].
The first and third sentences of this paragraph are largely the same;
can they be combined?
4.3. Candidate Racing
However, an implementation is unable to know the full tree
before it is formed [...]
This is probably not phrased well. Strictly, it is a tautology: you
can't know the full tree until you know it. I suspect the meaning is
that an implementation may want to start racing candidates before the
full tree is known.
Any timer or racing logic is isolated to a
given parent node, and is not ordered precisely with regards to other
children of other nodes.
"other children of other nodes" should be "children of other nodes".
4.3.3. Failover
An example in which failover is recommended is a race between a
Protocol Stack that uses a proxy and a Protocol Stack that bypasses
the proxy. Failover is useful in case the proxy is down or
misconfigured, but any more aggressive type of racing may end up
unnecessarily avoiding a proxy that was preferred by policy.
This could be clarified. I started reading the paragraph assuming
that a connection without a proxy would always be preferred to one
with a proxy. However, in this example the opposite is true, and that
should be revealed at the beginning. Perhaps
An example in which failover is recommended is a race where a
Protocol Stack that uses a proxy is preferred to a Protocol Stack
that bypasses the proxy. Failover is useful in case the proxy is
down or misconfigured, but any more aggressive type of racing may
end up avoiding the proxy when it could have been used.
4.4.1. Determining Successful Establishment
If the only protocol being used is a transport protocol
with a clear handshake, like TCP, then the obvious choice is to
declare that node "connected" when the last packet of the three-way
handshake has been received.
We are discussing behavior of the client and the client does not know
when the last (ACK) packet of the three-way handshake has been
received because it does not receive that packet. You want to say
"... has been transmitted." since transmitting that packet is the last
thing the client does during TCP connection establishment.
4.5. Establishing multiplexed connections
Multiplexing several Connections over a single underlying transport
connection requires that the Connections to be multiplexed belong to
the same Connection Group (as is indicated by the application using
the Clone call). When the underlying transport connection supports
multi-streaming, the Transport Services System can map each
Connection in the Connection Group to a different stream. Thus, when
the Connections that are offered to an application by the Transport
Services API are multiplexed, the Transport Services implementation
can establish a new Connection by simply beginning to use a new
stream of an already established transport Connection and there is no
need for a connection establishment procedure. This, then, also
means that there may not be any "establishment" message (like a TCP
SYN), but the application can simply start sending or receiving.
Therefore, when the Initiate action of a Transport Services API is
called without Messages being handed over, it cannot be guaranteed
that the Remote Endpoint will have any way to know about this, and
hence a passive endpoint's ConnectionReceived event might not be
delivered until data is received. Instead, delivering the
ConnectionReceived event could be delayed until the first Message
arrives.
This should be clarified, I think. The first part is fine:
Multiplexing several Connections over a single underlying transport
connection requires that the Connections to be multiplexed belong to
the same Connection Group (as is indicated by the application using
the Clone call). When the underlying transport connection supports
multi-streaming, the Transport Services System can map each
Connection in the Connection Group to a different stream.
The next part is not quite correct:
Thus, when
the Connections that are offered to an application by the Transport
Services API are multiplexed, the Transport Services implementation
can establish a new Connection by simply beginning to use a new
stream of an already established transport Connection and there is no
need for a connection establishment procedure. This, then, also
means that there may not be any "establishment" message (like a TCP
SYN), but the application can simply start sending or receiving.
What is really going on is that the protocol has two layers, a lower
one that creates the connection over which streams are multiplexed and
a higher one that is the stream which is multiplexed. What is being
described is when the upper layer has no explicit handshake to
establish a new stream. But that is just another case of the
situation with UDP discussed in sec. 4.4.1 para. 1. I think a better
phrasing would be as follows (and probably reads better as a separate
paragraph):
Thus, when the Connections that are offered to an application by
the Transport Services API are multiplexed, the Transport Services
implementation can establish a new Connection by using a new stream
of an already established transport Connection. Effectively, the
streams are an additional protocol layer on top of the transport
connection, and for many such there is no explicit connection
establishment procedure for the new stream prior to sending data on
it. In this case, the same considerations apply to determining
stream establishment as apply to establishing a UDP connection, as
discussed in section 4.4.1.
The final part seems to be correct:
Therefore, when the Initiate action of a Transport Services API is
called without Messages being handed over, it cannot be guaranteed
that the Remote Endpoint will have any way to know about this, and
hence a passive endpoint's ConnectionReceived event might not be
delivered until data is received. Instead, delivering the
ConnectionReceived event could be delayed until the first Message
arrives.
but (1) it could be clarified, and (2) it applies generally to
non-handshake protocols, and so probably should be relocated to
e.g. sec. 4.4.1 between paragraphs 1 and 2, where we first discuss the
nuances of handshake-less protocols. Perhaps these adjustments to the
wording:
When the Initiate action of a Transport Services API is
called without Messages being handed over, depending on the
protocols involved, it is not guaranteed that the Remote
Endpoint will be notified of this, and hence a passive
endpoint's application may not receive a ConnectionReceived event
until it receives the first Message on the connection.
4.6. Handling connectionless protocols
The nuances of connectionless protocols also are discussed in
sec. 4.4.1 para. 1 and the part of the paragraph discussed above. It
may be an improvement to gather all that information into this
section.
Also, these considerations apply to any handshake-less protocol.
E.g. there probably is no guarantee that a server will accept another
stream on a multiplexed connection. So it may be worth introducing
the class "handshake-less protocol" explicitly.
To mitigate this, an
application can use a Message Framer (Section 6) on top of a
connectionless protocol to only mark a specific connection attempt as
ready when some data has been received, or after some application-
level handshake has been performed.
Of course, a Message Framer is just another protocol layered on top.
In this instance, the point is that it presents to the application a
protocol with a handshake while itself running over a protocol
without one. You might want to state that explicitly.
4.7.3. Implementing listeners for Multiplexed Protocols
If the
abstraction of Connection presented to the application is mapped to
the multiplexed stream, then the Listener should deliver new
Connection objects in the same way for either case.
What controls the condition "If the ... Connection is mapped to
[i.e. represents] a multiplexed stream ..."? It seems to me that
there is either a blanket rule (e.g. SCTP streams are always presented
as individual Connections) or there is a setting in the Listener. In
either case, this is described somewhere in the TAPS API definition,
and a reference should be given here.
The
implementation should allow the application to introspect the
Connection Group marked on the Connections to determine the grouping
of the multiplexing.
The meaning of "should allow" is not clear. I would expect that the
API is specified to set Connection Groups for all sets of Connections
that share a multiplexed lower-level protocol, and the API has a
defined mechanism for accessing the Connection Group of a Connection.
For some multiplexed protocols (e.g. QUIC), once the connection is
established, either end can initiate new streams. This means that the
TAPS client may need a way to provide Listener service. Does the TAPS
API define how this happens? And if so, have you checked that the
various passages of this document that refer to Listeners cover
client-end Listeners correctly?
5.1.2. Send Completion
The application should be notified whenever a Message or partial
Message has been consumed by the Protocol Stack, or has failed to
send.
Naively, notifying the application for every successfully sent
Message seems to be high overhead. Compare with the
first sentence of sec. 5.1.3 which suggests that a context switch per
Message sent may be excessive. Or is the meaning "The application can
request to be notified ..."? Also, presumably the API callback for
this has been defined, and its name should be provided here.
The
time at which a message failed to send is when Transport Services
implementation (including the Protocol Stack) has not successfully
sent the entire Message content or partial Message content on any
open candidate connection; this can depend on protocol-specific
timeouts.
The wording here seems to be poor, but the best I can suggest is "when
the implementation's attempt to send ... has failed".
5.2. Receiving Messages
If the top-level protocol only
supports a byte-stream and no framers were supported, the application
can control the flow of received data by specifying the minimum
number of bytes of Message content it wants to receive at one time.
Naively, I would expect this sentence to say "specifying the maximum
number of bytes", but perhaps the sentence as written is also correct.
Perhaps the application can specify both? Please check against the
API definition and update if necessary.
5.3. Handling of data for fast-open protocols
The text needs to clarify how 0-RTT data is handled by the API.
Naively, I would expect that 0-RTT data must be supplied with the
connection request, since it is sent with the handshake, after all.
However, sec. 5.3 para. 3 sent. 2 suggests that 0-RTT data that is
provided to the API needs to be marked with specific properties that
all 0-RTT data would have to be marked with, implying that the API is
provided with the data in a separate call from the connection request
call. How 0-RTT data is provided in API call(s) should be made clear
at the beginning of para. 3, as that is the background for everything
following.
An implementation can set this property
according to the protocols that it will race based on the given
Selection Properties when the application requests to establish a
connection.
How is the zeroRttMsgMaxLen value derived from the zeroRttMsgMaxLen
properties of the individual protocols that will potentially be used by
connection establishment? I'd guess that it is the minimum of them,
but in that case, if both a fast-open and a non-fast-open protocol
might be used, then zeroRttMsgMaxLen will be 0 and the application
will never get the benefit of fast-open. OTOH, if zeroRttMsgMaxLen is
0, what happens if it is the non-fast-open protocol that succeeds?
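To spell out the arithmetic behind that guess: the sketch below
assumes, hypothetically, that the combined value is the minimum over
the candidate protocols and that a non-fast-open protocol contributes
0. The function name, the candidate labels, and the byte counts are
illustrative only and come from neither document.

```python
def combined_zero_rtt_max_len(per_protocol_limits):
    """Hypothetical rule: the combined limit is the minimum over all candidates.

    per_protocol_limits maps a candidate label to its own 0-RTT limit in bytes,
    with 0 meaning the candidate cannot carry 0-RTT data at all.
    """
    return min(per_protocol_limits.values(), default=0)


# Illustrative numbers: one fast-open candidate and one ordinary candidate.
mixed = {"fast-open candidate": 1400, "ordinary candidate": 0}
fast_open_only = {"fast-open candidate": 1400}

mixed_limit = combined_zero_rtt_max_len(mixed)
fast_open_limit = combined_zero_rtt_max_len(fast_open_only)
```

Under this rule the mixed set yields 0, so the application could never
hand over 0-RTT data whenever any non-fast-open candidate might be
raced.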
In any case, if zeroRttMsgMaxLen is returned by the connection
establishment API request, how does the application provide the 0-RTT
data before the first connection request is sent?
In another aspect, the set of protocols that will be raced will only
be determined *during* the connection establishment process. E.g. the
usable protocols may depend on the host addresses. There is no way in
general for the API call to know which protocols might be used unless
either the API call finishes after the needed information is known, or
if the API call assumes that all protocols the implementation supports
might be used.
It is also possible for Protocol Stacks within a particular leaf node
to use 0-RTT handshakes without any safely replayable application
data if a protocol in the stack has idempotent handshake data to
send. For example, TCP Fast Open could use a Client Hello from TLS
as its 0-RTT data, shortening the cumulative handshake time.
This should be clarified. The data in question isn't 0-RTT data from
the *application's* perspective, and so that name shouldn't be applied
here without qualification. Perhaps
It is also possible for a protocol implementation in a Protocol
Stack for a particular leaf node to use 0-RTT data in a lower-level
protocol to carry its own handshakes; in such cases, whether the
data involved is safely replayable is determined by the protocol
implementation that generates the handshake data, not the
application (which does not generate the data). For example, TCP
Fast Open could use a Client Hello from TLS as its 0-RTT data,
shortening the cumulative handshake time.
6. Implementing Message Framers
Of course, a Message Framer is just another protocol layered on top of
a protocol stack, and I think it would be helpful if that was stated
early in this section.
But there are a lot of details of the situation that I think you want
to be clearer about. Does the TAPS specification define and require
support of a particular set of Message Framers? Applications will
generally take the defensive stance of not using Message Framers that
aren't required to be present.
Also, it's not clear to me to what degree this discussion is connected
to the API. In particular, does the TAPS API define how a Message
Framer interacts with the rest of the implementation? In that case,
the details in this section are normative. But if the idea is that
TAPS does not define such an API but that a TAPS implementation would
want to provide an API for "user" Message Framers, then this section
is a generic discussion of implementation considerations. Either way,
the intention should be stated clearly at the beginning.
E.g.
[...] these are ways for
applications or application frameworks to define their own Message
parsing to be included within a Connection's Protocol Stack.
suggests that allowing applications to specify their own custom
Message Framers is a part of the TAPS API definition, whereas
This section describes one possible API for defining
Message Framers, as an example.
suggests that there is no definition of such a facility (and so it is
an optional, non-standardized, extension in any implementation that
does have it).
7. Implementing Connection Management
If an error is encountered in setting a property (for example, if the
application tries to set a TCP-specific property on a Connection that
is not using TCP), the action should fail gracefully. The
application may be informed of the error, but the Connection itself
should not be terminated.
An important situation is if the application sets a property before
connection establishment is complete. In that case, the
implementation cannot tell whether the property is applicable to the
protocol that will eventually be chosen or not. So either the
implementation needs to reject the operation -- this requires that the
API clearly separates "general" from "protocol specific" properties,
allowing the former to be modified during connection establishment but
not the latter -- or the implementation has to store the property
value for the potential future use of the protocol -- this requires
that the implementation store *all* property values, even the ones
that are not applicable to all protocols. In either case, sec. 7
paras. 2 and 3 don't quite describe the situation, and the API
definition needs to make clear which of these situations applies.
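For concreteness, here is a minimal sketch of the second alternative,
in which every property value is stored during establishment and
applied only once a protocol has been chosen. The class, its methods,
and the property names are hypothetical and are not taken from the API
definition.

```python
class PendingConnection:
    """Hypothetical model of a Connection whose candidates are still being raced."""

    def __init__(self):
        self.pending = {}      # property name -> value, held until a protocol wins
        self.protocol = None

    def set_property(self, name, value):
        # Applicability is unknown before establishment, so store everything.
        self.pending[name] = value

    def establish(self, protocol, supported):
        """Record the winning protocol; split stored properties into applied/ignored."""
        self.protocol = protocol
        applied = {k: v for k, v in self.pending.items() if k in supported}
        ignored = sorted(set(self.pending) - set(applied))
        return applied, ignored


conn = PendingConnection()
conn.set_property("tcp.exampleOption", 30)   # illustrative protocol-specific property
conn.set_property("genericExample", 1)       # illustrative generic property
applied, ignored = conn.establish("UDP", supported={"genericExample"})
```

Note that the error for the inapplicable property can only be reported
after establishment, which is a behavior the API definition would have
to describe.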
7.1. Pooled Connection
The Transport Services API should allow protocol instances in the
Protocol Stack to pass up arbitrary generic or protocol-specific
errors that can be delivered to the application as Soft Errors.
Strictly speaking, the implementation should allow it, but (in order
to make any sense) the API definition *does* allow it, and defines how
it is to be done (as seen by the application). So saying "the ... API
should" isn't useful. Do you mean "The Transport Services
implementation SHOULD allow protocol instances ... errors, which will
be delivered to the application through the API."?
7.2. Handling Path Changes
If the device is able to rejoin a network
with the same IP address, a stateful transport connection can
generally resume. Thus, while it is useful for a Protocol Instance
to be aware of a temporary loss of connectivity, the Transport
Services implementation should not aggressively close connections in
these scenarios.
I think you want to expand on this a bit, and make the requirements
clearer, as in
(new paragraph) If the connection can be resumed in such a way that
its semantics are preserved across the path change, the Transport
Services implementation should not close the Connection. For
example, if the device is able to rejoin a network with the same IP
address, a TCP connection can generally resume operating, resending
any application data that was lost in transit. In the same
situation, a UDP connection can resume operating without resending
lost packets, because the semantics of the connection does not
specify reliable data delivery. But TCP cannot recover from loss
of the local IP address, as there is no way to ensure reliable
delivery of data that was in transit. Conversely, an HTTP/S client
that provides a request/response service to a specified server may
recover from loss of the local IP address, as long as the server
does not bind information (e.g. identity) to the client's address.
Thus, while it is useful for a Protocol Instance to be aware of a
temporary loss of connectivity, the Transport Services
implementation should not aggressively close connections in these
scenarios.
However that last point implies that the protocol alone may not be
sufficient information to determine whether reestablishment can use a
different local address; the application has to specify that as a
property.
8. Implementing Connection Termination
I think this section needs substantial revision. As far as I can
tell, the core of it is "Once you send a close on an SCTP connection,
any data in flight from the remote end will be lost, so in TAPS, the
semantics of a reliable, bidirectional, byte-stream connection are
that once the application calls Close, any data in flight from the
remote end MAY be lost." As a design choice, I would argue against it
(and argue instead that the implementation should cover over that SCTP
behavior), but I assume the authors know better than I do.
However, it's critically important that this is stated in the API
definition, as the application writer might not expect that property,
and certainly can't be expected to read the implementation
documentation to discover that it doesn't get quite as good service as
TCP provides.
Once that is done, this section can be reduced to mentioning that
because the API permits the loss of tail-end data, a reliable,
bidirectional, byte-stream connection can be mapped directly to an
SCTP stream.
10. Specific Transport Protocol Considerations
Each protocol that is supported by a Transport Services
implementation should have a well-defined API mapping.
This seems to be an odd statement. Of course, each of the common
protocols MUST have a well-defined API mapping, but also, that mapping
MUST be specified in the API definition document, or otherwise
applications couldn't use it in a standardized way. So why is the word
"should" used, and why is there not a reference at this point to the
API document that specifies these mappings?
Each protocol has a notion of Connectedness. Possible values for
Connectedness are:
"values" isn't the right word. Perhaps "Possible definitions of
Connectedness for various types of protocols are:"
I notice that the following protocols are mentioned in the document,
some seemingly as examples, but are not listed in sec. 10. What is
their status? Does the API definition state how to use them? If not,
why are they mentioned in the implementation document?
DTLS
HTTP
HTTP2/TLS/TCP
HTTP3/QUIC/UDP
QUIC
The absence of discussion of any of the HTTP request/response
protocols is particularly worrisome, as it suggests that there is no
defined way to use the API to use HTTP, and yet people write as if
implementations will support it.
12.1. Considerations for Candidate Gathering
Implementations should avoid downgrade attacks that allow network
interference to cause the implementation to select less secure, or
entirely insecure, combinations of paths and protocols.
12.2. Considerations for Candidate Racing
Implementations should ensure that all options have equivalent
security properties to avoid incentivizing attacks.
For 12.1, of course implementations should use all "downgrade
avoidance" techniques that are specified for each protocol in the
protocol's standards. But more thought needs to be given to the
situation where the application allows a set of protocols which,
taken as a whole, has a downgrade problem. There are only two
solutions: (1) TAPS allows the application to specify a group of
protocols with unequal security properties; in which case, the
application shouldn't expect to get more security than the least
secure protocol in the group. (2) TAPS forbids the application to
specify a group of protocols with unequal security properties and
enforces that condition. Which obtains depends on the API definition,
but the implementation has no leeway in either case, and this document
ought to state the situation clearly.
14. References
RFC 6724 "Default Address Selection for Internet Protocol Version 6 (IPv6)"
may be relevant, and if so, should be added to the references.
[END]