Re: [Last-Call] Genart last call review of draft-ietf-taps-impl-15

Michael Welzl <michawe@xxxxxxxxxx> · Tue, 9 May 2023 07:39:48 -0400

Dear Dale,
Many thanks for this very thoughtful review!  (and sorry for the delay)
I’ll give some answers below, as an author of the -impl and an editor of the -interface document - I hope my co-authors will agree with me, or comment if they do not.

On Apr 13, 2023, at 8:32 PM, Dale Worley via Datatracker <noreply@xxxxxxxx> wrote:

Reviewer: Dale Worley
Review result: Ready with Issues

I am the assigned Gen-ART reviewer for this draft. The General Area
Review Team (Gen-ART) reviews all IETF documents being processed
by the IESG for the IETF Chair.  Please treat these comments just
like any other last call comments.

For more information, please see the FAQ at

<https://trac.ietf.org/trac/gen/wiki/GenArtfaq>.

Document:  draft-ietf-taps-impl-15
Reviewer:  Dale R. Worley
Review Date:  2023-04-13
IETF LC End Date:  2023-04-14
IESG Telechat date:  [unknown]

Summary:

    This draft is on the right track but has open issues, described in
    the review.

Major issues:

I find only one major issue, but it applies to the entire document.
It's not at all clear to me whether this is a request to just be more
careful about wording in various places, a request to be more explicit
about the normative status of various statements, or a concern about
the overall structure of the draft-ietf-taps-* documents.  That is,
the problems I observe could be anything from a minor wording issue to
evidence that the API definition has omitted specifying some important
aspect of the API.  The authors will have to assess that.

My assumption is that if there is an "interface" document and an
"implementation" document, then one can write an application that uses
the API without referring to the implementation document.  E.g. if one
had previously wanted to "establish a TCP connection to the server",
one would use the TAPS API to "establish a reliable, bidirectional,
byte-stream connection to the server", and this would work reliably
with any TAPS implementation that supports TCP in any environment
where such a TCP connection can be made.

Also, regardless of the implementation, the API would work in the same
way as seen by the application, as given by the API definition.

More subtly, also, the API behavior is unchanged by the implementation
choices TAPS makes for a particular connection, as long as the
connection is within the requirements presented by the application.
E.g. if in providing a "reliable, bidirectional, byte-stream
connection" an SCTP stream is also a possibility, and the application
has not forbidden the use of SCTP, the implementation might choose to
use an SCTP stream, but since the connection is "a reliable,
bidirectional, byte-stream connection", the application would not have
to adjust its use of the API nor would it see different behavior in
the API.  To distinguish how the implementation was handling the
connection, the application would have to explicitly query the
Connection object.

Everything you say here is 100% correct.

I take these as requirements for good design of the TAPS system.

Yes.

Reading the text, I find more modal/conditional/subjunctive words than
I expect, given that what the application can discern by interacting
with the API is fixed by the API definition, and the implementation is
allowed all variations which do not violate the API definition.  A
quick script gives this census:
    can: 127
    could: 13
    may: 68
    should: 92
    would: 10
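
The script itself is not included in the review.  As a rough
illustration only, a census of this kind could be produced along the
following lines; the function name and the tokenisation rule are
assumptions, so the counts above will not necessarily be reproduced
exactly.

```python
import re
from collections import Counter

# The five modal words tallied in the review.
MODALS = ("can", "could", "may", "should", "would")


def modal_census(text: str) -> dict[str, int]:
    """Count whole-word, case-insensitive occurrences of each modal.

    Lowercasing means that BCP 14 keywords such as "MAY" are counted
    together with ordinary "may".  "cannot" is a separate token and is
    therefore not counted as "can".
    """
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    return {modal: words[modal] for modal in MODALS}


sample = (
    "The implementation should copy the properties. "
    "The application can reuse the Preconnection, but it cannot "
    "mutate the copy."
)
print(modal_census(sample))
```

Running it over the plain-text rendering of the draft would give a
starting list for the "check all modal words" pass.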

Consider, for example, this statement in section 2:

   Once a Preconnection has been used to create an outbound Connection
   or a Listener, the implementation should ensure that the copy of the
   properties held by the Connection or Listener cannot be mutated by
   the application making changes to the original Preconnection object.

First, since "should" is not a proper key word, its normative status
is unclear.  But given that these behaviors are directly observable by
the application, I would expect this property to be a behavior that is
required by the API definition:  "Once a Connection or Listener is
created from a Preconnection, its properties MUST NOT be affected by
changes to the properties of the Preconnection."  Or, given that the
underlying protocol system may make adhering to that impossible, it is
likely that the API definition would use "SHOULD NOT".

I agree that such prescriptions must be covered by the API document, but I believe that they generally are - this is what we set out to do!
For instance, regarding the case you quote, the API document contains, in the section discussing Initiate()  (section 7.1):
***
The Initiate() Action returns a Connection object. Once Initiate() has been called, any changes to the Preconnection MUST NOT have any effect on the Connection. However, the Preconnection can be reused, e.g., to Initiate another Connection.
***

and in the section discussing Listen()   (section 7.2):
***
The Listen() Action returns a Listener object. Once Listen() has been called, any changes to the Preconnection MUST NOT have any effect on the Listener. The Preconnection can be disposed of or reused, e.g., to create another Listener.
***

So, the quoted text from the implementation draft is a mere repetition of the rules that the API prescribes. For implementers, such reminders make for a more useful reading experience than having to go back and check “what exactly am I allowed to do?” in the API document for each and every detail.
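
To make the rule concrete, here is a minimal sketch of one way an implementation might satisfy it, by snapshotting the properties when Initiate() is called. The Python class shapes, the get_property helper and the string property values are assumptions made for illustration; neither draft prescribes them.

```python
import copy


class Connection:
    """Holds a private snapshot of the properties it was created with."""

    def __init__(self, properties):
        self._properties = properties

    def get_property(self, name):
        return self._properties.get(name)


class Preconnection:
    """Mutable bundle of properties that can be reused after initiate()."""

    def __init__(self, **properties):
        self.properties = dict(properties)

    def initiate(self):
        # Deep copy: later changes to this Preconnection cannot reach
        # the Connection, which is what the MUST NOT in the API asks for.
        return Connection(copy.deepcopy(self.properties))


pre = Preconnection(reliability="require")
conn = pre.initiate()

# Reusing the Preconnection with a changed property affects only
# Connections created afterwards.
pre.properties["reliability"] = "prohibit"
second = pre.initiate()

print(conn.get_property("reliability"))
print(second.get_property("reliability"))
```

The same snapshot approach would apply to Listen().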

Let me repeat that:  I expect such a requirement in the API
definition, and the implementation document would discuss the
consequences of the requirement.

...but that’s what this is?  Since changes to the Preconnection MUST NOT have any effect on the Connection that was created with Initiate()  [interface draft], an implementation must ensure that the copy of the properties held by the Connection or Listener cannot be mutated by the application making changes to the original Preconnection object [implementation draft].

Note that I changed the wording from (non-capital) “should” to “must” here because it’s hard to see how an implementation could do this differently. So, this should be fixed.  And, indeed:

  And I expect the wording in the
implementation document to make it clear that it is discussing how to
implement a requirement in the API definition.  (In this case,
probably all that is needed is to change "should" to "SHOULD", but of
course that word would be copied from the requirement statement in the
API document.)

So to fix this, I recommend that the authors go through the text
looking for modal words, and for each one, verify that what the text
is discussing is not actually a requirement of the API that is missing
from the API definition, and then that the discussion in this document
correctly reflects the requirement in the API definition.

- that seems to be the right way to approach this. It is a very good suggestion, many thanks. Indeed, while the case you quote is covered by the API draft, it would be terrible if we missed any other requirement there.

As a way forward, my plan is to file issues in our GitHub repository ( https://github.com/ietf-tapswg/api-drafts ) for the smaller things below, in addition to one extra issue that says “check all modal words in the implementation draft” - and then address them with PRs there. I hope that’s ok with you?

Thanks again!

Cheers,
Michael

The smaller issues that I have specific comments about are listed below.

Minor/editorial/nit issues:

Abstract

   This document serves as a guide to implementation on how to
   build such a system.

The phrases "implementation" and "how to build" seem redundant.
Perhaps "guide to implementing such a system"?  (Similarly, sec. 1,
para. 2, sent. 1.)

1.  Introduction

I expect the text of the introduction to be clearer on the normative
status of this text.  I would expect the implementation document to
have few normative statements beyond what are copied from the API
definition.  But it's possible to have details like the specifics of
candidate choice have SHOULD statements that aren't in the API
definition, as those behaviors aren't directly visible in the API
behavior and may be heavily affected by a particular implementation
environment.

2.  Implementing Connection Objects

   The properties held by a
   Connection or Listener are independent of other connections that are
   not part of the same Connection Group.

First, "connections" should be "Connections".

Indeed, it appears that the properties are independent of connections
in the same Connection Group, too, as properties are inherited from
the Preconnection and then fixed thereafter.

But I think my issue is what is meant by "properties".  I suspect that
the API model has two sorts of properties: one sort comprises those
inherited from the Preconnection and those set by the process of
connecting (e.g., the local ephemeral port), and another sort comprises
those that are not from the Preconnection but are shared among
Connections in a Connection Group.  If so, this distinction needs to be
made explicit.  But it's still not clear to me how these connection
properties can be changed once a Connection is established, even by
another Connection in a Connection Group.

   Once Initiate has been called, the Selection Properties and Endpoint
   information are immutable (i.e, an application is not able to later
   modify Selection Properties on the original Preconnection object).
   Listener objects are created with a Preconnection, at which point
   their configuration should be considered immutable by the
   implementation.

I had some difficulty reading this paragraph.  After thinking about
it, it seems that all of it is about listening, but that is not made
clear at the beginning, and I assumed it was continuing the discussion
of connection establishment in the previous paragraph.

Within that context, sec. 2 states that when a Listener is created,
all applicable properties are copied from the Preconnection, and no
later changes to the Preconnection change the Listener.  But the first
sentence says that the Preconnection object cannot be changed, which
makes no sense.

3.  Implementing Pre-Establishment

It appears that pre-establishment is the creating/updating of a
Preconnection object (and that Endpoints, Connection Properties, and
Capacity Profile are data within the Preconnection object), but this
should be stated directly and as early as possible in this section.

3.1.  Configuration-time errors

3.2.  Role of system policy

   Lastly, the implementation itself may default to disallowing certain
   network interfaces unless explicitly requested by the application and
   allowed by the system.

I believe the point being made by this sentence would be clearer if
"and allowed by the system" was omitted.  The meaning that phrase adds
is included in the meanings of the previous sentences.  Omitting that
phrase emphasizes the meaning that policy may require the application
to specifically request certain resources in order to obtain them (as
opposed to them being available whenever they are not explicitly
forbidden) -- which is a pattern that is not all that common in APIs
and so should be stated explicitly.

In regard to normativity, you probably want "MAY".  And I think it's
reasonable to have this statement only in the implementation document,
and not in the API definition, as the process of determining what
resources are allocated to a particular application request
"inherently" involves details of the implementation and environment
that an API definition cannot specify exactly.

   An
   implementation should attempt to look up the relevant policies for
   the system in a dynamic way to make sure it is reflecting an accurate
   version of the system policy, since the system's policy regarding the
   application's traffic may change over time due to user or
   administrative changes.

Two questions arise:  (1) Once a Connection is created, its transport
properties likely cannot be altered.  Does that mean that the last
moment at which policy can be applied to a Connection is when it is
created?  (2) This statement conflicts with "To avoid allocating
resources that are not finally needed, it is important that
configuration-time errors fail as early as possible." because if
policy is looked up dynamically, it's possible to create a
Preconnection that conflicts with policy at the time it is created,
but the policy then changes, and after that a Connection can be
created using the Preconnection without conflicting with policy.

Obviously, there is no simple, definitive solution to these problems,
but this is probably a good place in the document to point out the
complexities and the API features that enable an application to deal
with them.  E.g. I would expect that an application can request a
Preconnection that conflicts with the current policy, and by default,
that returns an error, so that simple/naive applications see the
problem and abort as early as possible.  But if the application
explicitly ignores the error, it can obtain such a Preconnection and,
under the expectation that it will become compatible with policy,
attempt to use it later.

Similarly, when creating a Connection, there is a possible error "the
Preconnection conflicts with policy right now, even though it did not
when it was created".  And there needs to be some sort of "policy has
changed and the Connection is now violating policy" error.

These error returns and how an application differentiates them from
other errors should be specified in the API definition.

4.  Implementing Connection Establishment

   For ease of illustration, this document structures the candidates for
   racing as a tree (see Section 4.1).  This is not meant to restrict
   implementations from structuring racing candidates differently.

   Any one of these sub-entries on the aggregate connection attempt
   would satisfy the original application intent.  The concern of this
   section is the algorithm defining which of these options to try,
   when, and in what order.

   During Candidate Gathering (Section 4.2), an implementation prunes
   and sorts branches according to the Selection Property preferences
   (Section 6.2 of [I-D.ietf-taps-interface].  It first excludes all
   protocols and paths that match a Prohibit property or do not match
   all Require properties.  Then it will sort branches according to
   Preferred properties, Avoided properties, and possibly other
   criteria.

This section needs to clarify which statements are normative and which
are not.  Naively, I would expect that the implementation is free to
implement connection selection in whatever way it wants, and this
section is just a suggestion.  But "Then it will sort branches
according to Preferred properties, Avoided properties ..." makes it
clear that the API definition requires the implementation to take into
account some properties.  In that context, what are the limitations on
the allowed selection algorithms?  Particularly, if an algorithm does
not organize candidates as a tree, what is the import of "sort
branches according to Preferred properties, Avoided properties"?
Also, using SHOULD and MUST would clarify things.
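
For concreteness, the prune-then-sort behavior that the quoted
paragraph describes can be sketched without any tree at all.  This is
only an illustration: the Candidate type, the feature strings, and the
ranking rule (count of Preferred matches first, then count of Avoided
matches) are assumptions, not something either draft specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    features: frozenset = frozenset()


def gather(candidates, require=(), prohibit=(), prefer=(), avoid=()):
    """Prune by Require/Prohibit, then sort by Prefer/Avoid."""
    # Step 1: exclude anything that matches a Prohibit property or
    # fails to match every Require property.
    usable = [
        c for c in candidates
        if all(p in c.features for p in require)
        and not any(p in c.features for p in prohibit)
    ]

    # Step 2: rank the survivors.  More Preferred matches sort earlier;
    # among equals, fewer Avoided matches sort earlier.  sorted() is
    # stable, so remaining ties keep their input order.
    def rank(c):
        preferred = sum(p in c.features for p in prefer)
        avoided = sum(p in c.features for p in avoid)
        return (-preferred, avoided)

    return sorted(usable, key=rank)


candidates = [
    Candidate("UDP", frozenset({"message-boundaries"})),
    Candidate("TCP", frozenset({"reliability"})),
    Candidate("SCTP", frozenset({"reliability", "message-boundaries",
                                 "multistreaming"})),
]
order = gather(candidates, require=["reliability"],
               prefer=["multistreaming"])
print([c.name for c in order])
```

A flat list like this gives the same trial order as a sorted tree,
which is one reading of what "sort branches" means for an
implementation that does not use a tree.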

4.1.  Structuring Candidates as a Tree

   The parent (or trunk) node of the tree will be represented by
   a single integer, such as "1".

My understanding is that the usual term is "root node".

   As noted above, the consideration of multiple candidates in a
   gathering and racing process can be conceptually structured as a
   tree; this terminological convention is used throughout this
   document.

   In protocol stacks, the layers are
   separated by '/' and ordered top-down.

Given that the tree structure has an up-down dimension and this
sentence is not referring to that dimension, it might be clearer to
say that the designations of protocol stack layers are "ordered
top-layer-first" since this document writes them that way
(e.g. "HTTP/TCP").

   A connection establishment tree may be degenerate, and only have a
   single leaf node, such as a connection attempt to an IP address over
   a single interface with a single protocol.

Given that "degenerate" is used nowhere else, this could be simplified
to "A connection establishment tree may consist of only a single leaf
node, such as a connection attempt to a specified IP address over a
single interface with a single protocol."

4.1.2.  Branching Order-of-Operations

   For example, if the application has indicated both a preference for
   WiFi over LTE and for a feature only available in SCTP, branches will
   be first sorted accord to path selection, with WiFi at the top.
   Then, branches with SCTP will be sorted to the top within their
   subtree according to the properties influencing protocol selection.

I find the use of "top" confusing here.  Given we are talking about
trees, I consider "top" to mean closer to the root of the tree.  Here,
it appears to mean "earlier in the set of children of a parent node".
I think "first" would be better, as we are arranging the children
under a parent node in what is usually shown as left-to-right order
(or actually, into the time-order in which the children will be
tried).

Note this trouble comes from the compact notation used for trees,
which causes the typographical top-bottom dimension to be used both
for the top-bottom axis of the layers of the tree and for the
sequencing of the children of a node.  Make sure people aren't
confused if they continue to think about trees in the usual graphical
presentation.
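
As a small illustration of "first" in the sense of trial order, the
WiFi/LTE and SCTP example can be written as a two-level ordering: paths
are ordered first, then protocols within each path.  The function names
and the rank callables are hypothetical, and TCP is added here only as
a second protocol to sort against.

```python
def build_tree(paths, protocols, path_rank, protocol_rank):
    """Return [(path, [protocols])] with the most preferred child first."""
    ordered_paths = sorted(paths, key=path_rank)
    return [
        (path, sorted(protocols, key=protocol_rank))
        for path in ordered_paths
    ]


def trial_order(tree):
    """Flatten the tree into the time-order in which leaves are tried."""
    return [f"{proto} over {path}" for path, protos in tree for proto in protos]


tree = build_tree(
    paths=["LTE", "WiFi"],
    protocols=["TCP", "SCTP"],
    path_rank=lambda p: 0 if p == "WiFi" else 1,       # WiFi before LTE
    protocol_rank=lambda p: 0 if p == "SCTP" else 1,   # SCTP before TCP
)
for attempt in trial_order(tree):
    print(attempt)
```

Here "first" refers to position in the flattened list, not to depth in
the tree.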

4.1.3.  Sorting Branches

   Implementations should sort the branches of the tree of connection
   options in order of their preference rank, from most preferred to
   least preferred.  Leaf nodes on branches with higher rankings
   represent connection attempts that will be raced first.
   Implementations should order the branches to reflect the preferences
   expressed by the application for its new connection, including
   Selection Properties, which are specified in
   [I-D.ietf-taps-interface].

The first and third sentences of this paragraph are largely the same;
can they be combined?

4.3.  Candidate Racing

   However, an implementation is unable to know the full tree
   before it is formed [...]

This is probably not phrased well.  Strictly, it is a tautology: you
can't know the full tree until you know it.  I suspect the meaning is
that an implementation may want to start racing candidates before the
full tree is known.

   Any timer or racing logic is isolated to a
   given parent node, and is not ordered precisely with regards to other
   children of other nodes.

"other children of other nodes" s.b. "children of other nodes".

4.3.3.  Failover

   An example in which failover is recommended is a race between a
   Protocol Stack that uses a proxy and a Protocol Stack that bypasses
   the proxy.  Failover is useful in case the proxy is down or
   misconfigured, but any more aggressive type of racing may end up
   unnecessarily avoiding a proxy that was preferred by policy.

This could be clarified.  I started reading the paragraph assuming
that a connection without a proxy would always be preferred to one
with a proxy.  However, in this example the opposite is true, and that
should be revealed at the beginning.  Perhaps

   An example in which failover is recommended is a race where a
   Protocol Stack that uses a proxy is preferred to a Protocol Stack
   that bypasses the proxy.  Failover is useful in case the proxy is
   down or misconfigured, but any more aggressive type of racing may
   end up avoiding the proxy when it could have been used.
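
The difference from more aggressive racing can be sketched as follows:
with failover, the next candidate is attempted only after the preferred
one has definitively failed.  The function names and the fake connect
helper are made up for the example, and a real implementation would
also need timeouts.

```python
def establish_with_failover(candidates, connect):
    """Try candidates strictly in preference order.

    A later candidate is attempted only after the earlier one has
    failed, so a working preferred candidate is never bypassed.
    """
    errors = []
    for candidate in candidates:
        try:
            return candidate, connect(candidate)
        except ConnectionError as exc:
            errors.append((candidate, exc))
    raise ConnectionError(f"all candidates failed: {errors}")


def fake_connect(down):
    """Build a connect() stand-in that fails for the names in `down`."""
    attempted = []

    def connect(candidate):
        attempted.append(candidate)
        if candidate in down:
            raise ConnectionError(f"{candidate} unreachable")
        return f"connection via {candidate}"

    return connect, attempted


connect, attempted = fake_connect(down={"proxy"})
chosen, _ = establish_with_failover(["proxy", "direct"], connect)
print(chosen, attempted)
```

When the proxy is up, the direct path is never attempted at all, which
is the property that policy cares about.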

4.4.1.  Determining Successful Establishment

   If the only protocol being used is a transport protocol
   with a clear handshake, like TCP, then the obvious choice is to
   declare that node "connected" when the last packet of the three-way
   handshake has been received.

We are discussing behavior of the client and the client does not know
when the last (ACK) packet of the three-way handshake has been
received because it does not receive that packet.  You want to say
"... has been transmitted." since transmitting that packet is the last
thing the client does during TCP connection establishment.

4.5.  Establishing multiplexed connections

   Multiplexing several Connections over a single underlying transport
   connection requires that the Connections to be multiplexed belong to
   the same Connection Group (as is indicated by the application using
   the Clone call).  When the underlying transport connection supports
   multi-streaming, the Transport Services System can map each
   Connection in the Connection Group to a different stream.  Thus, when
   the Connections that are offered to an application by the Transport
   Services API are multiplexed, the Transport Services implementation
   can establish a new Connection by simply beginning to use a new
   stream of an already established transport Connection and there is no
   need for a connection establishment procedure.  This, then, also
   means that there may not be any "establishment" message (like a TCP
   SYN), but the application can simply start sending or receiving.
   Therefore, when the Initiate action of a Transport Services API is
   called without Messages being handed over, it cannot be guaranteed
   that the Remote Endpoint will have any way to know about this, and
   hence a passive endpoint's ConnectionReceived event might not be
   delivered until data is received.  Instead, delivering the
   ConnectionReceived event could be delayed until the first Message
   arrives.

This should be clarified, I think.  The first part is fine:

   Multiplexing several Connections over a single underlying transport
   connection requires that the Connections to be multiplexed belong to
   the same Connection Group (as is indicated by the application using
   the Clone call).  When the underlying transport connection supports
   multi-streaming, the Transport Services System can map each
   Connection in the Connection Group to a different stream.

The next part is not quite correct:

   Thus, when
   the Connections that are offered to an application by the Transport
   Services API are multiplexed, the Transport Services implementation
   can establish a new Connection by simply beginning to use a new
   stream of an already established transport Connection and there is no
   need for a connection establishment procedure.  This, then, also
   means that there may not be any "establishment" message (like a TCP
   SYN), but the application can simply start sending or receiving.

What is really going on is that the protocol has two layers, a lower
one that creates the connection over which streams are multiplexed and
a higher one that is the stream which is multiplexed.  What is being
described is when the upper layer has no explicit handshake to
establish a new stream.  But that is just another case of the
situation with UDP discussed in sec. 4.4.1 para. 1.  I think a better
phrasing would be as follows (and probably reads better as a separate
paragraph):

   Thus, when the Connections that are offered to an application by
   the Transport Services API are multiplexed, the Transport Services
   implementation can establish a new Connection by using a new stream
   of an already established transport Connection.  Effectively, the
   streams are an additional protocol layer on top of the transport
   connection, and for many such there is no explicit connection
   establishment procedure for the new stream prior to sending data on
   it.  In this case, the same considerations apply to determining
   stream establishment as apply to establishing a UDP connection, as
   discussed in section 4.4.1.

The final part seems to be correct:

   Therefore, when the Initiate action of a Transport Services API is
   called without Messages being handed over, it cannot be guaranteed
   that the Remote Endpoint will have any way to know about this, and
   hence a passive endpoint's ConnectionReceived event might not be
   delivered until data is received.  Instead, delivering the
   ConnectionReceived event could be delayed until the first Message
   arrives.

but (1) it could be clarified, and (2) it applies generally to
non-handshake protocols, and so probably should be relocated to
e.g. sec. 4.4.1 between paragraphs 1 and 2, where we first discuss the
nuances of handshake-less protocols.  Perhaps these adjustments to the
wording:

   When the Initiate action of a Transport Services API is
   called without Messages being handed over, depending on the
   protocols involved, it is not guaranteed that the Remote
   Endpoint will be notified of this, and hence a passive
   endpoint's application may not receive a ConnectionReceived event
   until it receives the first Message on the connection.
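
On the passive side, the consequence for a handshake-less protocol can
be sketched like this: the Listener has nothing to react to until data
arrives, so the event fires on the first Message from a new remote
endpoint.  The class shape, the on_datagram method and the callback
name are assumptions for illustration, not the API's definitions.

```python
class Listener:
    """Delivers a ConnectionReceived-style event on first data."""

    def __init__(self, on_connection_received):
        self._known = set()
        self._on_connection_received = on_connection_received

    def on_datagram(self, remote, payload):
        # With no handshake there is no earlier signal, so the first
        # datagram from an unknown remote doubles as "establishment".
        if remote not in self._known:
            self._known.add(remote)
            self._on_connection_received(remote)
        return payload


events = []
listener = Listener(events.append)
listener.on_datagram(("192.0.2.1", 5000), b"first")
listener.on_datagram(("192.0.2.1", 5000), b"second")
listener.on_datagram(("192.0.2.2", 5000), b"first")
print(events)
```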

4.6.  Handling connectionless protocols

The nuances of connectionless protocols also are discussed in
sec. 4.4.1 para. 1 and the part of the paragraph discussed above.  It
may be an improvement to gather all that information into this
section.

Also, these considerations apply to any handshake-less protocol.
E.g. there probably is no guarantee that a server will accept another
stream on a multiplexed connection.  So it may be worth introducing
the class "handshake-less protocol" explicitly.

   To mitigate this, an
   application can use a Message Framer (Section 6) on top of a
   connectionless protocol to only mark a specific connection attempt as
   ready when some data has been received, or after some application-
   level handshake has been performed.

Of course, a Message Framer is just another protocol layered on top.
In this instance, the point is that it presents to the application a
protocol with handshake but it uses a protocol without a handshake.
You might want to state that explicitly.

4.7.3.  Implementing listeners for Multiplexed Protocols

   If the
   abstraction of Connection presented to the application is mapped to
   the multiplexed stream, then the Listener should deliver new
   Connection objects in the same way for either case.

What controls the condition "If the ... Connection is mapped to
[i.e. represents] a multiplexed stream ..."?  It seems to me that
there is either a blanket rule (e.g. SCTP streams are always presented
as individual Connections) or there is a setting in the Listener.  In
either case, this is described somewhere in the TAPS API definition,
and a reference should be given here.

   The
   implementation should allow the application to introspect the
   Connection Group marked on the Connections to determine the grouping
   of the multiplexing.

The meaning of "should allow" is not clear.  I would expect that the
API is specified to set Connection Groups for all sets of Connections
that share a multiplexed lower-level protocol, and the API has a
defined mechanism for accessing the Connection Group of a Connection.

For some multiplexed protocols (e.g. QUIC), once the connection is
established, either end can initiate new streams.  This means that the
TAPS client may need a way to provide Listener service.  Does the TAPS
API define how this happens?  And if so, have you checked that the
various passages of this document that refer to Listeners cover
client-end Listeners correctly?

5.1.2.  Send Completion

   The application should be notified whenever a Message or partial
   Message has been consumed by the Protocol Stack, or has failed to
   send.

Naively, notifying the application for every successfully sent
Message seems to be high overhead.  Compare with the
first sentence of sec. 5.1.3 which suggests that a context switch per
Message sent may be excessive.  Or is the meaning "The application can
request to be notified ..."?  Also, presumably the API callback for
this has been defined, and its name should be provided here.

   The
   time at which a message failed to send is when Transport Services
   implementation (including the Protocol Stack) has not successfully
   sent the entire Message content or partial Message content on any
   open candidate connection; this can depend on protocol-specific
   timeouts.

The wording here seems to be poor, but the best I can suggest is "when
the implementation's attempt to send ... has failed".

5.2.  Receiving Messages

   If the top-level protocol only
   supports a byte-stream and no framers were supported, the application
   can control the flow of received data by specifying the minimum
   number of bytes of Message content it wants to receive at one time.

Naively, I would expect this sentence to say "specifying the maximum
number of bytes", but perhaps the sentence as written is also correct.
Perhaps the application can specify both?  Please check against the
API definition and update if necessary.

5.3.  Handling of data for fast-open protocols

The text needs to clarify how 0-RTT data is handled by the API.
Naively, I would expect that 0-RTT data must be supplied with the
connection request, since it is sent with the handshake, after all.
However, sec. 5.3 para. 3 sent. 2 suggests that 0-RTT data that is
provided to the API needs to be marked with specific properties that
all 0-RTT would have to be marked with, implying that the API is
provided with the data in a separate call from the connection request
call.  How 0-RTT data is provided in API call(s) should be made clear
at the beginning of para. 3, as that is the background for everything
following.

   An implementation can set this property
   according to the protocols that it will race based on the given
   Selection Properties when the application requests to establish a
   connection.

How is the zeroRttMsgMaxLen value derived from the zeroRttMsgMaxLen
properties of the individual protocols that will potentially be used by
connection establishment?  I'd guess that it is the minimum of them,
but in that case, if both a fast-open and a non-fast-open protocol
might be used, then zeroRttMsgMaxLen will be 0 and the application
will never get the benefit of fast-open.  OTOH, if zeroRttMsgMaxLen is
0, what happens if it is the non-fast-open protocol that succeeds?
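
To illustrate the concern, here is the "minimum" interpretation written
out.  This is only the guess made above, not a rule taken from either
draft, and the stack names and byte limits are invented for the
example.

```python
def zero_rtt_msg_max_len(per_protocol_limits):
    """Combine per-protocol 0-RTT limits by taking the minimum.

    Under this rule, any candidate that cannot carry 0-RTT data
    (limit 0) forces the combined value to 0.
    """
    return min(per_protocol_limits.values(), default=0)


fast_only = {"fast-open stack A": 1200, "fast-open stack B": 900}
mixed = {"fast-open stack A": 1200, "plain stack": 0}

print(zero_rtt_msg_max_len(fast_only))
print(zero_rtt_msg_max_len(mixed))
```

With the mixed set the result is 0, which is exactly the case where the
application never gets the benefit of fast-open.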

In any case, if zeroRttMsgMaxLen is returned by the connection
establishment API request, how does the application provide the 0-RTT
data before the first connection request is sent?

In another aspect, the set of protocols that will be raced will only
be determined *during* the connection establishment process.  E.g. the
usable protocols may depend on the host addresses.  There is no way in
general for the API call to know which protocols might be used unless
either the API call finishes after the needed information is known or
the API call assumes that all protocols the implementation supports
might be used.

   It is also possible for Protocol Stacks within a particular leaf node
   to use 0-RTT handshakes without any safely replayable application
   data if a protocol in the stack has idempotent handshake data to
   send.  For example, TCP Fast Open could use a Client Hello from TLS
   as its 0-RTT data, shortening the cumulative handshake time.

This should be clarified.  The data in question isn't 0-RTT data from
the *application's* perspective, and so that name shouldn't be applied
here without qualification.  Perhaps

   It is also possible for a protocol implementation in a Protocol
   Stack for a particular leaf node to use 0-RTT data in a lower-level
   protocol to carry its own handshakes; in such cases, whether the
   data involved is safely replayable is determined by the protocol
   implementation that generates the handshake data, not the
   application (which does not generate the data).  For example, TCP
   Fast Open could use a Client Hello from TLS as its 0-RTT data,
   shortening the cumulative handshake time.

6.  Implementing Message Framers

Of course, a Message Framer is just another protocol layered on top of
a protocol stack, and I think it would be helpful if that were stated
early in this section.

But there are a lot of details of the situation that I think you want
to be clearer about.  Does the TAPS specification define and require
support of a particular set of Message Framers?  Applications will
generally take the defensive stance of not using Message Framers that
aren't required to be present.

Also, it's not clear to me to what degree this discussion is connected
to the API.  In particular, does the TAPS API define how a Message
Framer interacts with the rest of the implementation?  In that case,
the details in this section are normative.  But if the idea is that
TAPS does not define such an API but that a TAPS implementation would
want to provide an API for "user" Message Framers, then this section
is a generic discussion of implementation considerations.  Either way,
the intention should be stated clearly at the beginning.

E.g.

   [...] these are ways for
   applications or application frameworks to define their own Message
   parsing to be included within a Connection's Protocol Stack.

suggests that allowing applications to specify their own custom
Message Framers is a part of the TAPS API definition, whereas

   This section describes one possible API for defining
   Message Framers, as an example.

suggests that there is no definition of such a facility (and so it is
an optional, non-standardized, extension in any implementation that
does have it).

7.  Implementing Connection Management

   If an error is encountered in setting a property (for example, if the
   application tries to set a TCP-specific property on a Connection that
   is not using TCP), the action should fail gracefully.  The
   application may be informed of the error, but the Connection itself
   should not be terminated.

An important case is when the application sets a property before
connection establishment is complete.  In that case, the
implementation cannot tell whether the property is applicable to the
protocol that will eventually be chosen or not.  So either the
implementation needs to reject the operation -- this requires that the
API clearly separates "general" from "protocol specific" properties,
allowing the former to be modified during connection establishment but
not the latter -- or the implementation has to store the property
value for the potential future use of the protocol -- this requires
that the implementation store *all* property values, even the ones
that are not applicable to all protocols.  In either case, sec. 7
paras. 2 and 3 don't quite describe the situation, and the API
definition needs to make clear which of these situations applies.
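
The two alternatives can be sketched as follows.  This is only an illustration under stated assumptions: the class, its method names, and the convention that a dotted name such as "tcp.user_timeout" marks a protocol-specific property are all hypothetical and are not taken from the API definition.

```python
class PendingConnection:
    """Hypothetical connection whose protocol is unknown until established."""

    def __init__(self, strategy):
        if strategy not in ("reject", "store"):
            raise ValueError(strategy)
        self.strategy = strategy
        self.protocol = None   # chosen only when establishment completes
        self.stored = {}       # every value the implementation kept
        self.applied = {}      # values that apply to the chosen protocol

    @staticmethod
    def _owner(name):
        # Assumed convention: "tcp.user_timeout" belongs to "tcp";
        # a name without a dot is a general property.
        return name.split(".", 1)[0] if "." in name else None

    def set_property(self, name, value):
        owner = self._owner(name)
        if self.protocol is None:
            if owner is not None and self.strategy == "reject":
                return False   # protocol-specific, refused before establishment
            self.stored[name] = value
            return True
        if owner not in (None, self.protocol):
            return False       # fails gracefully; the connection stays open
        self.stored[name] = value
        self.applied[name] = value
        return True

    def established(self, protocol):
        # Apply whatever was stored that fits the protocol that won.
        self.protocol = protocol
        for name, value in self.stored.items():
            if self._owner(name) in (None, protocol):
                self.applied[name] = value


conn = PendingConnection("store")
conn.set_property("tcp.user_timeout", 30)
conn.set_property("sctp.example_option", 1)
conn.established("tcp")
print(conn.applied)
```

With the "store" strategy the implementation has to keep the SCTP-specific value even though it is never applied; with "reject" the application must wait until establishment completes before setting any protocol-specific property.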

7.1.  Pooled Connection

   The Transport Services API should allow protocol instances in the
   Protocol Stack to pass up arbitrary generic or protocol-specific
   errors that can be delivered to the application as Soft Errors.

Strictly speaking, the implementation should allow it, but (in order
to make any sense) the API definition *does* allow it, and defines how
it is to be done (as seen by the application).  So saying "the ... API
should" isn't useful.  Do you mean "The Transport Services
implementation SHOULD allow protocol instances ... errors, which will
be delivered to the application through the API."?

7.2.  Handling Path Changes

   If the device is able to rejoin a network
   with the same IP address, a stateful transport connection can
   generally resume.  Thus, while it is useful for a Protocol Instance
   to be aware of a temporary loss of connectivity, the Transport
   Services implementation should not aggressively close connections in
   these scenarios.

I think you want to expand on this a bit, and make the requirements
clearer, as in

   (new paragraph) If the connection can be resumed in such a way that
   its semantics are preserved across the path change, the Transport
   Services implementation should not close the Connection.  For
   example, if the device is able to rejoin a network with the same IP
   address, a TCP connection can generally resume operating, resending
   any application data that was lost in transit.  In the same
   situation, a UDP connection can resume operating without resending
   lost packets, because the semantics of the connection do not
   specify reliable data delivery.  But TCP cannot recover from loss
   of the local IP address, as there is no way to ensure reliable
   delivery of data that was in transit.  Conversely, an HTTP/S client
   that provides a request/response service to a specified server may
   recover from loss of the local IP address, as long as the server
   does not bind information (e.g. identity) to the client's address.
   Thus, while it is useful for a Protocol Instance to be aware of a
   temporary loss of connectivity, the Transport Services
   implementation should not aggressively close connections in these
   scenarios.

However that last point implies that the protocol alone may not be
sufficient information to determine whether reestablishment can use a
different local address; the application has to specify that as a
property.

8.  Implementing Connection Termination

I think this section needs substantial revision.  As far as I can
tell, the core of it is "Once you send a close on an SCTP connection,
any data in flight from the remote end will be lost, so in TAPS, the
semantics of a reliable, bidirectional, byte-stream connection are
that once the application calls Close, any data in flight from the
remote end MAY be lost."  As a design choice, I would argue against it
(and for having the implementation cover over that SCTP behavior), but I
assume the authors know better than I do.

However, it's critically important that this is stated in the API
definition, as the application writer might not expect that property,
and certainly can't be expected to read the implementation
documentation to discover that it doesn't get quite as good service as
TCP provides.

Once that is done, this section can be reduced to mentioning that
because the API permits the loss of tail-end data, a reliable,
bidirectional, byte-stream connection can be mapped directly to an
SCTP stream.

10.  Specific Transport Protocol Considerations

   Each protocol that is supported by a Transport Services
   implementation should have a well-defined API mapping.

This seems to be an odd statement.  Of course, each of the common
protocols MUST have a well-defined API mapping, but also, that mapping
MUST be specified in the API definition document, or otherwise
applications couldn't use it in a standardized way.  So why is the word
"should" used, and why is there not a reference at this point to the
API document that specifies these mappings?

   Each protocol has a notion of Connectedness.  Possible values for
   Connectedness are:

"values" isn't the right word.  Perhaps "Possible definitions of
Connectedness for various types of protocols are:"

I notice that the following protocols are mentioned in the document,
some seemingly as examples, but are not listed in sec. 10.  What is
their status?  Does the API definition state how to use them?  If not,
why are they mentioned in the implementation document?
    DTLS
    HTTP
    HTTP2/TLS/TCP
    HTTP3/QUIC/UDP
    QUIC

The absence of discussion of any of the HTTP request/response
protocols is particularly worrisome, as it suggests that there is no
defined way to use the API to use HTTP, and yet people write as if
implementations will support it.

12.1.  Considerations for Candidate Gathering

   Implementations should avoid downgrade attacks that allow network
   interference to cause the implementation to select less secure, or
   entirely insecure, combinations of paths and protocols.

12.2.  Considerations for Candidate Racing

   Implementations should ensure that all options have equivalent
   security properties to avoid incentivizing attacks.

For 12.1, of course implementations should use all "downgrade
avoidance" techniques that are specified for each protocol in the
protocol's standards.  But more thought needs to be given to the
situation where the application specifies allowing a set of protocols
which, taken as a whole, has a downgrade problem.  There are only two
solutions:  (1) TAPS allows the application to specify a group of
protocols with unequal security properties; in which case, the
application shouldn't expect to get more security than the least
secure protocol in the group.  (2) TAPS forbids the application to
specify a group of protocols with unequal security properties and
enforces that condition.  Which obtains depends on the API definition,
but the implementation has no leeway in either case, and this document
ought to state the situation clearly.
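
The two options can be illustrated with a minimal sketch.  The numeric security ranking, the candidate labels, and both function names are hypothetical and serve only to show the difference between the two behaviors.

```python
# Hypothetical security ranking: a higher number means stronger protection.
def effective_security(candidates):
    """Option (1): the application can rely only on the weakest candidate."""
    return min(candidates.values())


def require_equal_security(candidates):
    """Option (2): refuse a candidate set whose security levels differ."""
    levels = set(candidates.values())
    if len(levels) > 1:
        raise ValueError(
            "candidates have unequal security levels: %s" % sorted(levels)
        )
    return candidates


# Illustrative candidate set mixing a secured and an unsecured stack.
mixed = {"secured-stack": 2, "unsecured-stack": 0}
print(effective_security(mixed))
```

Under option (1) the mixed set is accepted but is worth no more than its weakest member; under option (2) the same set is rejected before any racing starts.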

14.  References

RFC 6724 "Default Address Selection for Internet Protocol Version 6 (IPv6)"
may be relevant, and if so, should be added to the references.

[END]
