Re: [Last-Call] Tsvart last call review of draft-ietf-mmusic-t140-usage-data-channel-11

Christer Holmberg <christer.holmberg=40ericsson.com@xxxxxxxxxxxxxx> · Thu, 30 Jan 2020 09:51:59 +0000

Hi,

See inline [CH].

 ---------

    > 1/ I wonder why Section 4.2.1 does not include any normative statements on how
    > to handle the maximum character transmission rate ('cps' attribute). RFC 4103
    > states that "In receipt of this parameter, devices MUST adhere to the request
    > by transmitting characters at a rate at or below the specified <integer>
    > value." Isn't a similar statement needed in this document?

The assumption has been that the associated procedures in 4103 apply.
Note that section 6 in RFC 4103 continues:

    "  Note that this parameter was not defined in RFC 2793 [16].
       Therefore implementations of the text/t140 format may be in use that
       do not recognize and act according to this parameter.  Therefore,
       receivers of text/t140 MUST be designed so they can handle temporary
       reception of characters at a higher rate than this parameter
       specifies.  As a result malfunction due to buffer overflow is avoided
       for text conversation with human input. "

This note may be historic now, but for the T140-usage draft we may have a similar case when an implementation has not managed
to implement support for the CPS parameter, or for dcsa at all.

[MS] I have also read that paragraph, but to me it sounded as a historic transition problem. Also, as far as I read this text in 4103, the
MUST is only about the design of the implementation, not about its actual operation. For instance, making reception at a higher rate
an optional feature that is turned off by default might not violate that MUST.

We had specific wording for that case earlier, but I think we deleted most of it. 
Do you think we should insert a similar precaution in the t140-usage draft, but referring to other reasons than RFC 2793 interop?

[MS] I don’t understand why the draft specifies a parameter if implementations shall just ignore the value. I guess the actual
use of this parameter could be better specified in the document.

[CH] I don't think we are saying that one shall ignore the value. We are simply realizing the fact that there may be legacy implementations that don't support the value.
In any case, I don't think this is a real-life problem in cases where T.140 is used as intended. As Gunnar mentioned, T.140 is not designed or intended for transport of large amounts of copy-pasted text but for human interactions.

-----

    > 2/ Also, it is not really clear from the document what would happen if a peer
    > exceeds this maximum character transmission rate (or the rate allowed by
    > congestion/flow control). What happens if the sender types faster than the
    > 'cps' attribute (say, an automated chat bot)? I guess characters would be
    > dropped at the sender? In that case, no missing text markers would be displayed
    > in the receiver, right?

I assume it could result in a buffer overflow sooner or later, but I think it is a local implementation issue how it is dealt with by the sender application. 

Perhaps Gunnar knows more about how implementations handle this?

What the CPS parameter tries to prevent is buffer overflow in slow ancient receiving devices. 
Modern implementations will likely have enough buffer space for storing quite a large volume of text for eventual transmission to the limited device. Instead, another risk appears. If you have a receiving device only able to present 4 characters per second, which is the reality if you have interop with the old TTYs through a gateway, then a modern device user who happens to paste a large piece of text or use speech-to-text technology will generate text in buffers that will take a very long time to present at the receiving device. The sense of a real-time conversation will be lost. (1000 characters will take more than 4 minutes to present!) 
The users need to realize that this technology is intended for human conversation. It is wise if implementations have the possibility for pasting in short texts, but it should not be used for document transfer. 
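
The arithmetic behind the 4 cps example can be checked with a small sketch. The function name is made up for illustration, and it assumes the receiving device presents characters at a constant rate with no other delays:

```python
def presentation_delay_seconds(num_chars: int, cps: int) -> float:
    """Time needed to present num_chars at a constant rate of cps characters per second."""
    if cps <= 0:
        raise ValueError("cps must be positive")
    return num_chars / cps


# 1000 pasted characters towards a 4 cps device (e.g. a TTY behind a gateway)
delay = presentation_delay_seconds(1000, 4)
print(f"{delay:.0f} s, about {delay / 60:.1f} minutes")
```

Under these assumptions the delay grows linearly with the amount of pasted text, which is why short pastes are tolerable and document-sized ones are not.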

[MS] The document could better describe this, IMHO. As the data channel is different and has other capabilities, it is not clear to me that all limitations explained in RFC 4103 apply here as well.
So, yes, it is a local implementation issue.
[MS] From a transport protocol design perspective, the „cps“ parameter may realize some sort of (simple) flow control. And in that case sender and receiver typically have to agree on the semantics. And your example actually describes a typical flow control problem: If a 4 cps receiver receives 1000 characters, it may actually be interested in slowing down the sender. Could it do so, for instance, by sending cps=0? As far as I understand the I-D, this would be allowed. How to deal with that case is not a local implementation issue only. Well, we are here probably discussing corner cases of corner cases. But as far as I can see, the document does not comprehensively describe the use of the cps attribute.

[CH] If the receiver cannot handle the received characters, in my opinion it is better if it uses the direction attributes (4.2.3) to indicate that it is not willing to receive text. I guess we could add some text about that. Something like:

      "If the receiver receives text at a higher rate than it can handle, e.g., because the sender does not support the cps parameter, the receiver
        can indicate to the sender that it is not willing to receive more text using the direction attributes [ref-to-section-4.2.3]"

---

    > 3/ Section 5.3. "Data Buffering" includes the following statement: "As
    > described in [T140], buffering can be used to reduce overhead, with the maximum
    > buffering time being 500 ms.  It can also be used for staying within the
    > maximum character transmission rate (Section 4.2), if such has been provided by
    > the peer." I don't understand the second sentence. At first sight, enforcing
    > the 'cps' attribute does not only require a buffer, but also some sort of rate
    > shaper/policer (e.g., token bucket or the like). Do I miss something?

The 2nd sentence talks about the case when the user input rate is faster than the maximum transmission rate (see question #2).

Yes, so in the second case, the transmission procedure will detect that not all buffered characters are allowed to be transmitted, when the transmission interval (normally 300 ms) has passed. So they will be kept in a buffer at the sending side. Is that unclear so we need to clarify it? 
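
A minimal sketch of the sender-side behaviour described above, assuming a 300 ms transmission interval and a peer that has signalled a cps value. The class CpsLimitedSender, its method names and the credit counter are hypothetical and are not taken from the draft or from RFC 4103. Characters that may not be sent yet simply stay in the buffer; nothing is dropped here.

```python
from collections import deque


class CpsLimitedSender:
    """Hypothetical sender-side buffer that keeps the average rate at or below cps."""

    def __init__(self, cps: int, interval_ms: int = 300):
        self.cps = cps
        self.interval_ms = interval_ms
        self.pending = deque()  # characters typed but not yet transmitted
        self.credit = 0         # send allowance in 1/1000 character (avoids float rounding)

    def type_text(self, text: str) -> None:
        """Queue user input for transmission."""
        self.pending.extend(text)

    def on_interval(self) -> str:
        """Call once per transmission interval; returns the characters to send now."""
        # Earn allowance for the interval that has just passed.
        self.credit += self.cps * self.interval_ms
        n = min(self.credit // 1000, len(self.pending))
        self.credit -= n * 1000
        chunk = "".join(self.pending.popleft() for _ in range(n))
        if not self.pending:
            # Nothing left to send: do not bank allowance for a later burst.
            self.credit = 0
        return chunk


sender = CpsLimitedSender(cps=4)
sender.type_text("x" * 1000)  # e.g. a large paste
sent_per_interval = [len(sender.on_interval()) for _ in range(10)]
print(sent_per_interval, "remaining:", len(sender.pending))
```

With cps=4 the allowance is 1.2 characters per 300 ms interval, so the sketch carries the fractional remainder over to the next interval. This is the kind of simple rate shaper the review comment asks about; a real implementation may well do it differently.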

[MS] I had to partly reverse-engineer the handling of the buffer from RFC 4103, although it is not exactly clear whether all considerations therein also apply to this specification. For instance, I believe that your additional explanation could be added to the document to better clarify the intention. And just to illustrate that there could be more ambiguity: To me, Section 5.3 could also imply that characters queued in the sender are dropped from the buffer if they cannot be transmitted within 500ms. That is a relatively small threshold for links with an RTT of 200ms or more (e.g., satellite or slow GPRS links), e.g. if retransmissions are needed due to packet loss or if congestion/flow control kicks in. Unlike RFC 4103, transport is now reliable and this could have some side effects even if the data rates in text conversations are very small. Off the top of my head, I would assume that a buffer threshold of 500ms will probably work very well in most parts of the Internet. But there are some areas with bad connectivity, high packet loss rates, etc. I am not sure if the proposed parameters really work well in such corner cases. Typically, such questions would be answered by measurement data from running code.

[CH] I don't know whether it is within the scope of this document to normatively define that characters shall be dropped if they cannot be transmitted within 500ms. In my opinion that is part of the generic T.140 procedures, not specific to data channels. But, perhaps we could say something like:

       "If a character cannot be transmitted within 500ms, and the sender chooses to drop the character from
         the buffer, the sender SHOULD inform the user about it."

---

    > 4/ Also in Section 5.3 is written: "An implementation needs to take the user
    > requirements for smooth flow and low latency in real-time text conversation
    > into consideration when assigning a buffer time.  It is RECOMMENDED to use the
    > default transmission interval of 300 milliseconds [RFC4103], or lower, for
    > T.140 data channels". What is meant here by "or lower"? Does the document want
    > to recommend values much smaller than 300 ms, say, 1 ms? As explained in RFC
    > 4103, this could increase the overhead and bitrate, right? The absolute rate
    > values are relatively small for large parts of today's Internet, but couldn't
    > this text conversation be particularly useful in scenarios with very small
    > capacity of links (i.e., kbps range)?

I suggest removing the "or lower" part, since the recommendation is to use 300.

Modern applications, especially speech-to-text, work better with lower delays. So, having a 300 ms delay may be on the high side. Transmission intervals down to 100 ms will be experienced as an improvement for some applications. The load in both bandwidth and packets per second is still low compared to what audio and video (often used in the same sessions) cause.

[MS] Indeed, bandwidth and packet rate will be small compared to other traffic. Nonetheless, this I-D somewhat specifies a service with real-time requirement that runs over a reliable transport channel, which could e.g. traverse highly congested links. Just assuming „the transport will just always work“ is a bit dangerous, IMHO.
It is a RECOMMENDATION,  so, yes, "or lower" could be deleted, but I prefer to leave it.
[MS] Maybe the document could write something like „It is RECOMMENDED to use the default transmission interval of 300 milliseconds [RFC4103] for T.140 data channels. Implementers MAY also use lower values". However, note that RFC 4103 backs the parameter of 300ms by estimating the overhead. If this document allows smaller values, it may have to similarly reason about the impact. 
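
To make the trade-off concrete, here is a rough sketch of how the per-packet overhead scales with the transmission interval. The default of 100 bytes of header overhead per packet is a placeholder assumption, not a measured value for SCTP over DTLS over UDP, and the sketch assumes one packet is sent in every interval and ignores acknowledgements and retransmissions.

```python
def header_overhead_bps(interval_ms: int, overhead_bytes: int = 100) -> float:
    """Header overhead in bit/s when one packet is sent every interval_ms.

    overhead_bytes is a placeholder for the combined per-packet headers.
    """
    packets_per_second = 1000 / interval_ms
    return packets_per_second * overhead_bytes * 8


for interval_ms in (300, 100):
    pps = 1000 / interval_ms
    print(f"{interval_ms} ms: {pps:.2f} packets/s, "
          f"{header_overhead_bps(interval_ms):.0f} bit/s of headers")
```

Under these assumptions, going from 300 ms to 100 ms triples both the packet rate and the header overhead, whatever the actual header size turns out to be.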

[CH] I guess it depends on whether the recommendation is to specifically use 300ms, or whether the recommendation is to use 300 or lower. RFC 4103 talks about specifically using 300ms, so I think we should follow that for T.140 data channels. So, I am fine with the text you suggest.

---

    > 5/ Section 5.4 mandates: "Retransmission of already successfully transmitted
    > T140blocks MUST be avoided, and missing text markers [T140ad1] SHOULD be
    > inserted in the received data stream where loss is detected or suspected." I
    > believe a better wording for the MUST would be "... successfully received
    > T140blocks ...", albeit the document does not detail how an implementation can
    > indeed fulfill this MUST. Regarding the SHOULD, I assume that "loss suspected"
    > could be determined by a heuristic. Could such a heuristic fail and result in
    > spurious missing text markers? If so, would a SHOULD be reasonable for that?

Regarding the MUST, T.140 does not provide acknowledgement that T140blocks have been received. It uses a reliable data channel, so as long as the data channel is up and running the sender can only assume that the blocks will be successfully transmitted.

Perhaps Gunnar knows more about how receiving implementations would "suspect" loss?

The requirement from the T.140 presentation level is that the channel shall deliver in order and without duplication.
Possible loss should be indicated. For RFC 4103, there is a slight risk that packets are lost, and if more are lost than 
can be recovered by the redundancy, then a suspected loss has appeared and should be indicated in the text presentation.
There is a chance that something invisible in the text stream was lost. We cannot know.

For the t140-usage case, the situation is different. We have reliable delivery in order as long as not more than
about 7 retries are made and nothing is blocking transmission so that the watchdog breaks the SCTP associations.
But that can happen and the draft says that reestablishment shall be tried. At that moment it may be hard to know from
the sender side what was successfully transmitted, because we do not have any application-level check of delivery. What
must be avoided is to retransmit something that might have been received. It is better to let something be lost and insert
an indication that something might have been lost.

[MS] I am not really familiar with internals of the SCTP stack. But to me there is ambiguity in the term „already successfully transmitted
T140blocks“. What does that mean? That the T140block has been copied into the SCTP send buffer? Or that an IP packet containing this
data has been transmitted to the peer? I think the former may not immediately imply the latter, and what exactly has happened may be difficult
to know. Maybe it would be safer to avoid the words „successfully transmitted“. For instance, „Retransmission of already sent T140blocks
MUST be avoided“ may work around some of the terminology issues.

That is what the text tries to say.  It might be a good habit to always insert an indicator for suspected loss by the receiver, when the SCTP associations are refreshed. I think the current wording allows that and smarter solutions if possible. It is good to not require the transmitter to insert the loss marker, because that might make it harder to apply security to the transmission chain.

[MS] I was wondering if it is possible that the transmitter erroneously inserts a loss marker even if _no_ data was lost, because the receiver wrongly „suspected“ loss. As far as I understand, the ITU-T T.140 Addendum specifies the loss marker for „discovered data loss“. So, it is not clear to me if using this for _suspected_ loss would be consistent with the ITU-T T.140 Addendum (which I had to read to actually understand this I-D).

[MS] Anyway, I agree that the current specification allows „smarter“ solutions simply by omitting details on (well, granted) corner cases. The question is whether the current spec is precise enough to ensure interoperability between implementations and to fully describe the characteristics to potential users. I don’t understand this use case well enough and therefore I defer that question to the TSV ADs.

[CH] Perhaps we could remove "already successfully transmitted T140blocks", since there currently is no way to verify that. Maybe something like:

OLD:

   "In case of network failure or congestion, T.140 data channels might
   fail and get torn down.  If this happens but the session sustains, it
   is RECOMMENDED that implementations tries to reestablish the T.140
   data channels.  If reestablishment of the T.140 data channel is
   successful, an implementation MUST evaluate if any T140blocks were
   lost.  Retransmission of already successfully transmitted T140blocks
   MUST be avoided, and missing text markers [T140ad1] SHOULD be
   inserted in the received data stream where loss is detected or
   suspected."

NEW:

   "In case of network failure or congestion, T.140 data channels might
   fail and get torn down.  If this happens but the session sustains, it
   is RECOMMENDED that implementations tries to reestablish the T.140
   data channels. As a T.140 data channel does not provide a mechanism
   for the receiver to identify retransmitted T140blocks, the sender MUST
   NOT retransmit T140blocks unless it has strong reasons to suspect that
   a T140block has been lost. Similarly, as a T.140
   data channel does not provide a mechanism for a receiver to detect
   lost T140blocks, it MUST NOT insert missing text markers [T140ad1]
   unless it has strong reasons to suspect that a T140block has been lost.
   Different mechanisms used by senders and receivers to suspect packet loss
   are outside the scope of this specification."

Regards,

Christer
