This is still outstanding, since November. Victor, where are we on this one?

Barry

On Mon, Nov 25, 2019 at 1:46 AM Benjamin Kaduk <kaduk@xxxxxxx> wrote:
>
> Hi Victor,
>
> On Tue, Nov 19, 2019 at 03:14:21PM -0500, victor.demjanenko@xxxxxxxxx wrote:
> > Hi Ben,
> >
> > Sorry I overlooked sending you a response. I would like to address the two
> > concerns you have by explaining what the speech coders are doing.
>
> Thanks for the extra clarifications. To supply one of my own: I'm not
> concerned that the protocol doesn't work as implemented, but just want to
> make sure that the document includes enough information to admit new
> implementations without guesswork. That is to say, "either tell me how to
> do it or tell me where to look that tells me how to do it".
>
> > WRT 600 bps MELP, there is one TSVCIS mode that uses one bit beyond the
> > 54-bit frame for MELP 600 as a frame sync which alternates between frames.
> > With two or more MELP 600bps frames in one RTP packet, if any frame
> > indicates 600 bps by CODA being 0 and CODB being 1, then we know the stream
> > is 600bps. If there is a single frame in an RTP packet, you can still
> > deduce this by looking at every other RTP packet (every other MELP 600bps
> > frame) and by the timestamp advance. Most likely the two ends would
> > negotiate 600 bps in SDP anyway, so there really should not be a problem. I
> > know it's not pretty but it's workable. I hope this explanation helps you
> > with the concerns for this issue.
>
> In this case, the use as an "end-to-end framing bit" (i.e., the alternating
> behavior you describe above) is not explicitly stated; one might imagine a
> scheme where the framing usage is to have the bit cycle through 1, 1, 0,
> and 0, or some other scheme. I'd suggest noting in the document that if
> any instance of (CODA, CODB) == (0, 1) is observed, then the 600bps mode is
> in use. It might also be helpful to include the observation that two
> successive MELPe payloads with CODA == CODB == 0 indicate the 2400bps mode
> (and that seeing them in a single RTP packet is decisive, whereas
> additional information about packet non-loss would be needed in the
> one-MELPe-frame-per-RTP-packet case), but that would be a fair bit of
> additional text and might be diminishing returns. (Or, of course, the use
> of CODB as an alternating 1/0 bit as the framing usage could be documented
> instead.)
>
> > As for the TSVCIS parameter packing/unpacking, this is really simple. There
> > is exactly one three-bit parameter, exactly one five-bit parameter and a
> > variable number of eight-bit parameters. In our view, the speech coder
> > itself (or a wrapper for it) is responsible for preparing the block of
> > octets. RTP then just transports it. On receive, the complementary wrapper
> > reverses the packing operation. I hope this clarifies and explains the
> > simplicity.
>
> That's exactly what I expected to happen; however, it's not what I believe
> the current text of the document is describing. Specifically, I think that
> the current text implies that the "preparing the block of octets" and
> "complementary wrapper reverses the packing operation" are supposed to be
> part of the RTP payload format that this document describes, but this
> document does not have enough information to actually perform those
> operations reversibly. If the packing is to be done in the speech coder,
> then this document doesn't need to talk about the packing at all (e.g., at
> the end of Section 2); if we need to keep the packing/wrapper in this
> document then we need to indicate that there's a defined priority order for
> the (8-octet) TSVCIS parameters in the TSVCIS references, to allow the
> packing/unpacking to be deterministic.
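
For illustration only, the wrapper behaviour described above (one three-bit parameter, one five-bit parameter, then a variable number of eight-bit parameters, packed on send and unpacked by the complementary wrapper on receive) might look like the sketch below. Nothing here is taken from the draft or from [TSVCIS]: the function names, the fixed parameter order, and the choice to put the three-bit value in the low bits of the first octet with its natural bit weights are all assumptions. The round trip is only reversible because both sides share those assumptions, which is the point being made about needing a defined order.

```python
from typing import List, Tuple


def pack_params(p3: int, p5: int, p8: List[int]) -> bytes:
    """Pack one 3-bit, one 5-bit and N 8-bit parameters into N+1 octets."""
    if not 0 <= p3 < 8 or not 0 <= p5 < 32:
        raise ValueError("three-bit or five-bit parameter out of range")
    if any(not 0 <= v < 256 for v in p8):
        raise ValueError("eight-bit parameter out of range")
    # Assumed layout (hypothetical): three-bit field in the low bits of the
    # first octet, five-bit field in the high bits.
    first = (p5 << 3) | p3
    return bytes([first, *p8])


def unpack_params(block: bytes) -> Tuple[int, int, List[int]]:
    """Reverse pack_params, relying on the same assumed order and widths."""
    if not block:
        raise ValueError("empty parameter block")
    first = block[0]
    return first & 0x07, first >> 3, list(block[1:])
```

A receiver that did not know the order and widths in advance could not tell where one parameter ends and the next begins, since the octets carry no per-parameter framing.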
>
> Thanks,
>
> Ben
>
> >
> > -----Original Message-----
> > From: Benjamin Kaduk <kaduk@xxxxxxx>
> > Sent: Thursday, October 31, 2019 8:12 PM
> > To: Barry Leiba <barryleiba@xxxxxxxxxxxx>
> > Cc: victor.demjanenko@xxxxxxxxx; Roni Even (A) <roni.even@xxxxxxxxxx>; The
> > IESG <iesg@xxxxxxxx>; Catherine Meadows <catherine.meadows@xxxxxxxxxxxx>;
> > IETF SecDir <secdir@xxxxxxxx>; draft-ietf-payload-tsvcis@xxxxxxxx; Ali Begen
> > <ali.begen@networked.media>; avtcore-chairs@xxxxxxxx; avt@xxxxxxxx; Dave
> > Satterlee (Vocal) <Dave.Satterlee@xxxxxxxxx>; IETF discussion list
> > <ietf@xxxxxxxx>; draft-ietf-payload-tsvcis.all@xxxxxxxx
> > Subject: Re: Benjamin Kaduk's Discuss on draft-ietf-payload-tsvcis-03: (with
> > DISCUSS and COMMENT)
> >
> > I don't think so, unfortunately.
> >
> > I do see the clarification about CODB's potential for deviation from Table
> > 1, that only the 600 bps MELPe is allowed to deviate, and that CODA gets us
> > to "it's one of 2400 or 600 bps" and the RTP timestamp disambiguates that
> > 600 bps is in use. But, it seems that this means that the recipient in
> > general should not rely on CODB to differentiate 600 from 2400 bps, and
> > instead is more robustly implemented by *always* using the RTP timestamp to
> > detect 600 bps, since that will always work and CODB will sometimes not work
> > under conditions not fully specified here. So, if we are unwilling or
> > unable to clarify what those conditions are (e.g., whether at a minimum
> > mutual agreement is required), then I think we need to describe this
> > procedure of consulting the RTP timestamp as the default behavior and avoid
> > giving the impression that CODB should be used to do so.
> >
> > Additionally, I don't see anything to address my concern about TSVCIS
> > parameter decoding.
> > To be clear, the procedure I see this document
> > describing is that:
> > - TSVCIS gives parameters (and their lengths in bits) to the codec
> >   described in this document
> > - this document specifies how to densely encode those parameters into a
> >   bytestream
> > - RTP transmits that encoded bytestream to the peer
> > - the codec specified by this document is responsible for turning that
> >   encoded bytestream back into a list of TSVCIS parameters (and their
> >   lengths in bits)
> >
> > I don't see how that last step is attainable with only the information
> > provided by this document. I *assume* that one of the TSVCIS specifications
> > has a canonical (ordered) listing of parameters, and that the list of
> > parameters given to this codec in the first step will always be an initial
> > prefix of that list, but that's just me guessing at how to make sense of the
> > stated procedure given insufficient information. I don't think it's
> > appropriate to make the reader of an RFC guess at what to do; we need to
> > either say how to do it or give a pointer to an external reference that
> > does.
> >
> > -Ben
> >
> > On Tue, Oct 29, 2019 at 02:26:09PM -0400, Barry Leiba wrote:
> > > Ben, does the -04 version address everything?
> > >
> > > Barry
> > >
> > > On Thu, Oct 24, 2019 at 1:42 PM <victor.demjanenko@xxxxxxxxx> wrote:
> > > >
> > > > I forgot to address security comments in one email. The changes are:
> > > >
> > > > Section 8, second paragraph - Suggested edit by reviewer
> > > >
> > > > (was)
> > > > This RTP payload format and the TSVCIS decoder do not exhibit any
> > > > significant non-uniformity in the receiver-side computational
> > > > complexity for packet processing and thus are unlikely to pose a
> > > > denial-of-service threat due to the receipt of pathological data.
> > > > Additionally, the RTP payload format does not contain any active
> > > > content.
> > > > > > > > (now) > > > > This RTP payload format and the TSVCIS decoder, to the best of our > > > > knowledge, do not exhibit any significant non-uniformity in the > > > > receiver-side computational complexity for packet processing and thus > > > > are unlikely to pose a denial-of-service threat due to the receipt of > > > > pathological data. Additionally, the RTP payload format does not > > > > contain any active content. > > > > > > > > > > > > Section 8, third paragraph - Suggested edit by reviewer > > > > > > > > (was) > > > > Please see the security considerations discussed in [RFC6562] > > > > regarding VAD and its effect on bitrates. > > > > > > > > (now) > > > > Please see the security considerations discussed in [RFC6562] > > > > regarding Voice Activity Detect (VAD) and its effect on bitrates. > > > > > > > > Victor > > > > > > > > -----Original Message----- > > > > From: victor.demjanenko@xxxxxxxxx <victor.demjanenko@xxxxxxxxx> > > > > Sent: Thursday, October 24, 2019 10:05 AM > > > > To: 'Roni Even (A)' <roni.even@xxxxxxxxxx>; 'Benjamin Kaduk' > > > > <kaduk@xxxxxxx>; 'The IESG' <iesg@xxxxxxxx> > > > > Cc: draft-ietf-payload-tsvcis@xxxxxxxx; 'Ali Begen' > > > > <ali.begen@networked.media>; avtcore-chairs@xxxxxxxx; avt@xxxxxxxx; > > > > 'Dave Satterlee (Vocal)' <Dave.Satterlee@xxxxxxxxx> > > > > Subject: RE: Benjamin Kaduk's Discuss on > > > > draft-ietf-payload-tsvcis-03: (with DISCUSS and COMMENT) > > > > > > > > Hi Everyone, > > > > > > > > First we want to thank everyone for their review and comments for this > > draft RFC. We believe we reviewed all the comments and suggestions and > > incorporated them adequately in the next draft (04). We'd like to send out > > this list of exact changes in case anyone has additional comments or thinks > > the clarifications are inadequate. We would be most happy to address > > concerns before publishing draft 04 tomorrow. 
> > > > > > > > With so many emails from a half dozen or more reviewers, we apologize > > that we cannot address each sender individually. We hope this detail is > > sufficient for everyone. > > > > > > > > Again, many thanks to all. > > > > > > > > Victor & Dave > > > > > > > > -------------------------------------------------------------------- > > > > -------------------------- > > > > > > > > Section 1.1 - Suggested reference to RFC 8088 added. > > > > > > > > (was) > > > > Best current practices for writing an RTP payload format > > > > specification were followed [RFC2736]. > > > > > > > > (now) > > > > Best current practices for writing an RTP payload format > > > > specification were followed [RFC2736] [RFC8088]. > > > > > > > > > > > > Section 2, paragraphs 3 and 4 - Suggested edits by reviewers > > > > > > > > (was) > > > > In addition to the augmented speech data, the TSVCIS specification > > > > identifies which speech coder and framing bits are to be encrypted, > > > > and how they are protected by forward error correction (FEC) > > > > techniques (using block codes). At the RTP transport layer, only the > > > > speech coder related bits need to be considered and are conveyed in > > > > unencrypted form. In most IP-based network deployments, standard > > > > link encryption methods (SRTP, VPNs, FIPS 140 link encryptors or Type > > > > 1 Ethernet encryptors) would be used to secure the RTP speech > > > > contents. Further, it is desirable to support the highest voice > > > > quality between endpoints which is only possible without the overhead > > > > of FEC. > > > > > > > > TSVCIS augmented speech data is derived from the signal processing > > > > and data already performed by the MELPe speech coder. For the > > > > purposes of this specification, only the general parameter nature of > > > > TSVCIS will be characterized. 
Depending on the bandwidth available > > > > (and FEC requirements), a varying number of TSVCIS specific speech > > > > coder parameters need to be transported. These are first byte-packed > > > > and then conveyed from encoder to decoder. > > > > > > > > (now) > > > > In addition to the augmented speech data, the TSVCIS specification > > > > identifies which speech coder and framing bits are to be encrypted, > > > > and how they are protected by forward error correction (FEC) > > > > techniques (using block codes). At the RTP transport layer, only the > > > > speech-coder-related bits need to be considered and are conveyed in > > > > unencrypted form. In most IP-based network deployments, standard > > > > link encryption methods (SRTP, VPNs, FIPS 140 link encryptors or Type > > > > 1 Ethernet encryptors) would be used to secure the RTP speech > > > > contents. > > > > > > > > TSVCIS augmented speech data is derived from the signal processing > > > > and data already performed by the MELPe speech coder. For the > > > > purposes of this specification, only the general parameter nature of > > > > TSVCIS will be characterized. Depending on the bandwidth available > > > > (and FEC requirements), a varying number of TSVCIS-specific speech > > > > coder parameters need to be transported. These are first byte-packed > > > > and then conveyed from encoder to decoder. > > > > > > > > > > > > Section 3, last sentence paragraph 3 - Suggested edit by reviewer > > > > > > > > (was) > > > > When more than one codec data frame is > > > > present in a single RTP packet, the timestamp is, as always, that of > > > > the oldest data frame represented in the RTP packet. > > > > > > > > (now) > > > > When more than one codec data frame is > > > > present in a single RTP packet, the timestamp specified is that of > > > > the oldest data frame represented in the RTP packet. 
> > > > > > > > > > > > Section 3.1, last paragraph - Clarified permission for MELP 600 > > > > end-to-end framing bit > > > > > > > > (was) > > > > It should be noted that CODB for both the 2400 and 600 bps modes MAY > > > > deviate from the values in Table 1 when bit 55 is used as an end-to- > > > > end framing bit. Frame decoding would remain distinct as CODA being > > > > zero on its own would indicate a 7-byte frame for either rate and the > > > > use of 600 bps speech coding could be deduced from the RTP timestamp > > > > (and anticipated by the SDP negotiations). > > > > > > > > (now) > > > > It should be noted that CODB for MELPe 600 bps mode MAY deviate from > > > > the value in Table 1 when bit 55 is used as an end-to-end framing > > > > bit. Frame decoding would remain distinct as CODA being zero on its > > > > own would indicate a 7-byte frame for either 2400 or 600 bps rate and > > > > the use of 600 bps speech coding could be deduced from the RTP > > > > timestamp (and anticipated by the SDP negotiations). > > > > > > > > > > > > Section 3.2, first paragraph - Clarifications requested by reviewers > > > > > > > > (was) > > > > The TSVCIS augmented speech data as packed parameters MUST be placed > > > > immediately after a corresponding MELPe 2400 bps payload in the same > > > > RTP packet. The packed parameters are counted in octets (TC). In > > > > the preferred placement, shown in Figure 6, a single trailing octet > > > > SHALL be appended to include a two-bit rate code, CODA and CODB, > > > > (both bits set to one) and a six-bit modified count (MTC). The > > > > special modified count value of all ones (representing a MTC value of > > > > 63) SHALL NOT be used for this format as it is used as the indicator > > > > for the alternate packing format shown next. In a standard > > > > implementation, the TSVCIS speech coder uses a minimum of 15 octets > > > > for parameters in octet packed form. 
The modified count (MTC) MUST > > > > be reduced by 15 from the full octet count (TC). Computed MTC = TC- > > > > 15. This accommodates a maximum of 77 parameter octets (maximum > > > > value of MTC is 62, 77 is the sum of 62+15). > > > > > > > > (now) > > > > The TSVCIS augmented speech data as packed parameters MUST be placed > > > > immediately after a corresponding MELPe 2400 bps payload in the same > > > > RTP packet. The packed parameters are counted in octets (TC). The > > > > preferred placement SHOULD be used for TSVCIS payloads with TC less > > > > than or equal to 77 octets, is shown in Figure 6. In the preferred > > > > placement, a single trailing octet SHALL be appended to include a > > > > two-bit rate code, CODA and CODB, (both bits set to one) and a six- > > > > bit modified count (MTC). The special modified count value of all > > > > ones (representing a MTC value of 63) SHALL NOT be used for this > > > > format as it is used as the indicator for the alternate packing > > > > format shown next. In a standard implementation, the TSVCIS speech > > > > coder uses a minimum of 15 octets for parameters in octet packed > > > > form. The modified count (MTC) MUST be reduced by 15 from the full > > > > octet count (TC). Computed MTC = TC-15. This accommodates a maximum > > > > of 77 parameter octets (maximum value of MTC is 62, 77 is the sum of > > > > 62+15). > > > > > > > > > > > > Section 3.3, first paragraph - Suggested edit by reviewer > > > > > > > > (was) > > > > A TSVCIS RTP packet consists of zero or more TSVCIS coder frames > > > > (each consisting of MELPe and TSVCIS coder data) followed by zero or > > > > one MELPe comfort noise frame. The presence of a comfort noise frame > > > > can be determined by its rate code bits in its last octet. 
> > > > > > > > (now) > > > > A TSVCIS RTP packet payload consists of zero or more consecutive > > > > TSVCIS coder frames (each consisting of MELPe 2400 and TSVCIS coder > > > > data), with the oldest frame first, followed by zero or one MELPe > > > > comfort noise frame. The presence of a comfort noise frame can be > > > > determined by its rate code bits in its last octet. > > > > > > > > > > > > Section 3.3, fourth paragraph - Clarification requested by reviewers > > > > > > > > (was) > > > > TSVCIS coder frames in a single RTP packet MAY be of different coder > > > > bitrates. With the exception for the variable length TSVCIS > > > > parameter frames, the coder rate bits in the trailing byte identify > > > > the contents and length as per Table 1. > > > > > > > > (now) > > > > TSVCIS coder frames in a single RTP packet MAY have varying TSVCIS > > > > parameter octet counts. Its packed parameter octet count (length) is > > > > indicated in the trailing byte(s). All MELPe frames in a single RTP > > > > packet MUST be of the same coder bitrate. For all MELPe coder > > > > frames, the coder rate bits in the trailing byte identify the > > > > contents and length as per Table 1. > > > > > > > > > > > > Section 4.1 - Editor note removed > > > > > > > > > > > > Section 4.1 - Change controller is now > > > > > > > > (now) > > > > Change controller: IETF, contact <avt@xxxxxxxx> > > > > > > > > > > > > Section 5, first paragraph - Suggested edits by reviewers > > > > > > > > (was) > > > > A primary application of TSVCIS is for radio communications of voice > > > > conversations, and discontinuous transmissions are normal. When > > > > TSVCIS is used in an IP network, TSVCIS RTP packet transmissions may > > > > cease and resume frequently. RTP synchronization source (SSRC) > > > > sequence number gaps indicate lost packets to be filled by PLC, while > > > > abrupt loss of RTP packets indicates intended discontinuous > > > > transmissions. 
> > > > > > > > (now) > > > > A primary application of TSVCIS is for radio communications of voice > > > > conversations, and discontinuous transmissions are normal. When > > > > TSVCIS is used in an IP network, TSVCIS RTP packet transmissions may > > > > cease and resume frequently. RTP synchronization source (SSRC) > > > > sequence number gaps indicate lost packets to be filled by Packet > > > > Loss Concealment (PLC), while abrupt loss of RTP packets indicates > > > > intended discontinuous transmissions. Resumption of voice > > > > transmission SHOULD be indicated by the RTP marker bit (M) set to 1. > > > > > > > > > > > > Section 10 - Added reference > > > > > > > > (added) > > > > [RFC8088] Westerlund, M., "How to Write an RTP Payload Format", > > > > RFC 8088, DOI 10.17487/RFC8088, May 2017, > > > > <http://www.rfc-editor.org/info/rfc8088>. > > > > > > > > -------------------------------------------------------------------- > > > > ----------------------------- > > > > > > > > > > > > -----Original Message----- > > > > From: Roni Even (A) <roni.even@xxxxxxxxxx> > > > > Sent: Sunday, October 6, 2019 2:09 AM > > > > To: victor.demjanenko@xxxxxxxxx; 'Benjamin Kaduk' <kaduk@xxxxxxx>; > > > > 'The IESG' <iesg@xxxxxxxx> > > > > Cc: draft-ietf-payload-tsvcis@xxxxxxxx; 'Ali Begen' > > > > <ali.begen@networked.media>; avtcore-chairs@xxxxxxxx; avt@xxxxxxxx; > > > > 'Dave Satterlee (Vocal)' <Dave.Satterlee@xxxxxxxxx> > > > > Subject: RE: Benjamin Kaduk's Discuss on > > > > draft-ietf-payload-tsvcis-03: (with DISCUSS and COMMENT) > > > > > > > > Hi, > > > > About the reference to TSVCIS. > > > > The RTP payload is about how to encapsulate the payload in an RTP > > packet. The objective is to define how an RTP stack can insert the tsvcis > > frames and extract the tsvcis frames from the RTP packet. Typically it is > > not required to understand the payload structure in order to be able to > > perform the encapsulation. 
> > > > This is why the reference to the payload is Informational and we did
> > > > not require it to be publicly available. If there is a need to
> > > > understand the payload itself for the encapsulation, then we need
> > > > more information in the RTP payload specification and a publicly
> > > > available normative reference. I think this is not the case here.
> > > >
> > > > Roni Even
> > > >
> > > > AVTCore co-chair (ex Payload)
> > > >
> > > > -----Original Message-----
> > > > From: victor.demjanenko@xxxxxxxxx
> > > > [mailto:victor.demjanenko@xxxxxxxxx]
> > > > Sent: Saturday, October 05, 2019 12:18 AM
> > > > To: 'Benjamin Kaduk'; 'The IESG'
> > > > Cc: draft-ietf-payload-tsvcis@xxxxxxxx; 'Ali Begen';
> > > > avtcore-chairs@xxxxxxxx; avt@xxxxxxxx; 'Victor Demjanenko, Ph.D.'; 'Dave
> > > > Satterlee (Vocal)'
> > > > Subject: RE: Benjamin Kaduk's Discuss on
> > > > draft-ietf-payload-tsvcis-03: (with DISCUSS and COMMENT)
> > > >
> > > > Everyone,
> > > >
> > > > Thanks for the comments. I think I misunderstood the ambiguity with
> > > > respect to changing rates within an RTP packet. That was not the plan. An
> > > > RTP packet must have MELP speech frames of the same rate. What is possible
> > > > is that the amount of augmented TSVCIS speech data may vary from one speech
> > > > frame to the next. This allows for a dynamic VDR as suggested by the NRL
> > > > paper. So an RTP packet may have varying TSVCIS data but must always have
> > > > MELPe 2400 data.
> > > >
> > > > Again, backwards parsing is necessary but the timestamp uniformly
> > > > increments 22.5 msec per combined MELP/TSVCIS speech frame.
> > > >
> > > > The NRL paper is a good public reference on the VDR aspects. The actual
> > > > TSVCIS spec we had was FOUO so we could not replicate its detail. (I
> > > > believe a later spec is public or at least partially public. I am trying to
> > > > get this.) The opaque data is pretty obvious with the TSVCIS spec in hand.
> > > >
> > > > We will address the issues/concerns raised next week.
Other business > > had priority. > > > > > > > > Thank you and enjoy the weekend. > > > > > > > > Regards, > > > > > > > > Victor & Dave > > > > > > > > -----Original Message----- > > > > From: Benjamin Kaduk via Datatracker <noreply@xxxxxxxx> > > > > Sent: Wednesday, October 2, 2019 10:40 PM > > > > To: The IESG <iesg@xxxxxxxx> > > > > Cc: draft-ietf-payload-tsvcis@xxxxxxxx; Ali Begen > > > > <ali.begen@networked.media>; avtcore-chairs@xxxxxxxx; > > > > ali.begen@networked.media; avt@xxxxxxxx > > > > Subject: Benjamin Kaduk's Discuss on draft-ietf-payload-tsvcis-03: > > > > (with DISCUSS and COMMENT) > > > > > > > > Benjamin Kaduk has entered the following ballot position for > > > > draft-ietf-payload-tsvcis-03: Discuss > > > > > > > > When responding, please keep the subject line intact and reply to > > > > all email addresses included in the To and CC lines. (Feel free to > > > > cut this introductory paragraph, however.) > > > > > > > > > > > > Please refer to > > > > https://www.ietf.org/iesg/statement/discuss-criteria.html > > > > for more information about IESG DISCUSS and COMMENT positions. > > > > > > > > > > > > The document, along with other ballot positions, can be found here: > > > > https://datatracker.ietf.org/doc/draft-ietf-payload-tsvcis/ > > > > > > > > > > > > > > > > -------------------------------------------------------------------- > > > > -- > > > > DISCUSS: > > > > -------------------------------------------------------------------- > > > > -- > > > > > > > > I support Magnus' point about the time-ordering of adjacent frames in a > > packet. > > > > > > > > Additionally, I am not sure that there's quite enough here to be > > interoperably implementable. Specifically, we seem to be lacking a > > description of how an encoder or decoder knows which TSVCIS parameters, and > > in what order, to byte-pack or unpack, respectively. 
One might surmise that > > there is a canonical listing in [TSVCIS], but this document does not say > > that, and furthermore [TSVCIS] is only listed as an informative reference. > > (I couldn't get my hands on my copy, at least on short notice.) If we > > limited ourselves to treating the TSVCIS parameters as an entirely opaque > > blob (codec, convey these N octets to the peer with the appropriate one- or > > two-byte trailer for payload type identification and framing), that would be > > interoperably implementable, since the black-box bits are up to some other > > codec to interpret. > > > > > > > > In a similar vein, we mention but do not completely specify the > > potential for using CODB as an end-to-end framing bit, in Section 3.1 (see > > Comment), which is not interoperably implementable without further details. > > > > > > > > > > > > -------------------------------------------------------------------- > > > > -- > > > > COMMENT: > > > > -------------------------------------------------------------------- > > > > -- > > > > > > > > Where is [TSVCIS] available? > > > > > > > > Is [NRLVDR] the same as > > > > https://apps.dtic.mil/dtic/tr/fulltext/u2/a588068.pdf ? A URL in the > > references would be helpful. > > > > > > > > Is additional TSVCIS data only present after 2400bps MELPe and the first > > thing to get dropped under bandwidth pressure? The abstract and > > introduction imply this by calling out MELPe 2400 bps speech parameters > > explicitly, but Section 3 says that TSVCIS augments standard 600, 1200, and > > 2400 bps MELP frames. 
> > > > > > > > It's helpful that Section 3.3 gives some general guidance for decoding > > this payload type ("[t]he way to determine the number of TSVCIS/MELPe frames > > is to identify each frame type and length"), but I think some generic > > considerations would be very helpful to the reader much earlier, along the > > lines of "MELPe and TSVCIS data payloads are decoded from the end, using the > > CODA and CODB (and, if necessary, CODC and others) bits to determine the > > type of payload. For MELPe payloads the type also indicates the payload > > length, whereas for TSVCIS data an additional length field is present, in > > one of two possible formats. A TSVCIS coder frame consists of a MELPe data > > payload followed by zero or one TSVCIS data payload; after the TSVCIS > > payload's presence/length is determined, then the preceding MELPe payload > > can be determined and decoded. Per Section 3.3, multiple TSVCIS frames can > > be present in a single RTP packet." This (or something like it) would also > > serve to clarify the role of the COD* bits, which is otherwise only > > implicitly introduced. > > > > > > > > Section 1.1 > > > > > > > > RFC 2736 is BCP 36 (but it's updated by RFC 8088 which is for some > > reason an Informational document and not part of BCP 36?!). > > > > > > > > Section 2 > > > > > > > > In addition to the augmented speech data, the TSVCIS specification > > > > identifies which speech coder and framing bits are to be encrypted, > > > > and how they are protected by forward error correction (FEC) > > > > techniques (using block codes). At the RTP transport layer, only the > > > > speech coder related bits need to be considered and are conveyed in > > > > unencrypted form. 
> > > >    In most IP-based network deployments, standard
> > > >
> > > > Am I reading this correctly that this text is just summarizing what's in
> > > > the TSVCIS spec in terms of what needs to be in unencrypted form, so the
> > > > "only the speech coder related bits[...]" is not new information from this
> > > > document? I'm not sure I agree with the conclusion, regardless -- won't the
> > > > (MELPe) speech coder bits be enough to convey the semantic content of the
> > > > audio stream, something that one might desire to keep confidential?
> > > >
> > > >    link encryption methods (SRTP, VPNs, FIPS 140 link encryptors or Type
> > > >    1 Ethernet encryptors) would be used to secure the RTP speech
> > > >    contents. Further, it is desirable to support the highest voice
> > > >    quality between endpoints which is only possible without the overhead
> > > >    of FEC.
> > > >
> > > > I think I'm missing a step in how this conclusion was reached.
> > > >
> > > >    TSVCIS will be characterized. Depending on the bandwidth available
> > > >    (and FEC requirements), a varying number of TSVCIS specific speech
> > > >    coder parameters need to be transported. These are first byte-packed
> > > >    and then conveyed from encoder to decoder.
> > > >
> > > > Per the Discuss point, how do I know which parameters need to be
> > > > transported, and in what order?
> > > >
> > > >    Byte packing of TSVCIS speech data into packed parameters is
> > > >    processed as per the following example:
> > > >
> > > >       Three-bit field: bits A, B, and C (A is MSB, C is LSB)
> > > >       Five-bit field: bits D, E, F, G, and H (D is MSB, H is LSB)
> > > >
> > > >       MSB                                                   LSB
> > > >          0      1      2      3      4      5      6      7
> > > >       +------+------+------+------+------+------+------+------+
> > > >       |  H   |  G   |  F   |  E   |  D   |  C   |  B   |  A   |
> > > >       +------+------+------+------+------+------+------+------+
> > > >
> > > >    This packing method places the three-bit field "first" in the lowest
> > > >    bits followed by the next five-bit field.
> > > >    Parameters may be split
> > > >    between octets with the most significant bits in the earlier octet.
> > > >    Any unfilled bits in the last octet MUST be filled with zero.
> > > >
> > > > I agree with Adam that this is very unclear. A is the MSB of the
> > > > three-bit field but the LSB of the octet overall?
> > > > We probably need an example of splitting a parameter across octets as
> > > > well, to get the bit ordering right.
> > > >
> > > > Section 3.1
> > > >
> > > >    It should be noted that CODB for both the 2400 and 600 bps modes MAY
> > > >    deviate from the values in Table 1 when bit 55 is used as an
> > > >    end-to-end framing bit. Frame decoding would remain distinct as CODA
> > > >    being
> > > >
> > > > Where is the use of CODB as an end-to-end framing bit defined? If we're
> > > > going to provide neither a complete description of how to do it nor a
> > > > reference to a better description, we probably shouldn't mention it at all.
> > > >
> > > > Section 3.2
> > > >
> > > >    RTP packet. The packed parameters are counted in octets (TC). In
> > > >    the preferred placement, shown in Figure 6, a single trailing octet
> > > >    SHALL be appended to include a two-bit rate code, CODA and CODB,
> > > >
> > > > I'd consider saying something about this being the preferred format
> > > > ("placement") due to its shorter length than the alternative, and say
> > > > that it "SHOULD be used for TSVCIS payloads with TC less than or equal to 77
> > > > octets".
> > > >
> > > > Section 3.3
> > > >
> > > > When a longer packetization interval is used, is that indicated by
> > > > signaling or RTP timestamps or otherwise?
> > > >
> > > >    TSVCIS coder frames in a single RTP packet MAY be of different coder
> > > >    bitrates. With the exception for the variable length TSVCIS
> > > >    parameter frames, the coder rate bits in the trailing byte identify
> > > >    the contents and length as per Table 1.
> > > >
> > > > Maybe also note that the penultimate octet gives the length there?
> > > >
> > > >    Information describing the number of frames contained in an RTP
> > > >    packet is not transmitted as part of the RTP payload. The way to
> > > >    determine the number of TSVCIS/MELPe frames is to identify each frame
> > > >    type and length thereby counting the total number of octets within
> > > >    the RTP packet.
> > > >
> > > > terminology nit: if a frame is the combination of MELPe and TSVCIS
> > > > payload data units then there are two layers of decoding to get a length for
> > > > the frame, since we have to get the TSVCIS length and then the MELPe length.
> > > >
> > > > Section 4.2
> > > >
> > > >    Parameter "ptime" cannot be used for the purpose of specifying
> > > >    the
> > > >
> > > > nit: missing article ("The parameter")
> > > >
> > > >    will be impossible to distinguish which mode is about to be used
> > > >    (e.g., when ptime=68, it would be impossible to distinguish if the
> > > >    packet is carrying one frame of 67.5 ms or three frames of 22.5 ms).
> > > >
> > > > So how is the operating mode determined, then?
> > > > (I think this is the same question I asked above)
> > > >
> > > > Section 4.4
> > > >
> > > >    For example, if offerer bitrates are "2400,600" and answer bitrates
> > > >    are "600,2400", the initial bitrate is 600. If other bitrates are
> > > >    provided by the answerer, any common bitrate between the offer and
> > > >    answer MAY be used at any time in the future. Activation of these
> > > >    other common bitrates is beyond the scope of this document.
> > > >
> > > > It seems important to specify whether this requires a new O/A exchange
> > > > or can be done "spontaneously" by just encoding different frame types.
> > > > (It seems like the latter is possible, on first glance, and this is
> > > > implied by Section 3.3's discussion of mixing them in a single
> > > > packet.)
> > > >
> > > > Section 5
> > > >
> > > > Please expand PLC at first use (not second).
> > > >
> > > > Section 6
> > > >
> > > > I don't understand the PLC usage.
> > > > Is the idea that a receiver, on
> > > > seeing an SSRC gap, constructs fictitious PLC frames to "fill the gap"
> > > > and passes the resulting stream to the decoder?
> > > >
> > > > Section 8
> > > >
> > > >    and important considerations in [RFC7201]. Applications SHOULD use
> > > >    one or more appropriate strong security mechanisms. The rest of this
> > > >    section discusses the security-impacting properties of the payload
> > > >    format itself.
> > > >
> > > > I thought we described TSVCIS itself (much earlier in the document) as
> > > > requiring encryption for some data; wouldn't that translate to a "MUST"
> > > > here and not a "SHOULD"?
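
As a rough sketch of the rate inference suggested at the top of this thread for 7-byte MELPe frames: any frame with (CODA, CODB) == (0, 1) means the 600 bps mode is in use, and two successive frames with CODA == CODB == 0 mean 2400 bps, provided no frame was lost in between. The function name and the idea of passing in already-extracted (CODA, CODB) pairs are assumptions made for illustration; the sketch does not locate the bits within a frame, and it presumes CODB alternates between frames in the 600 bps framing mode as described in the thread.

```python
from typing import Iterable, Optional, Tuple


def infer_melpe_rate(cod_bits: Iterable[Tuple[int, int]]) -> Optional[int]:
    """Infer 600 or 2400 bps from (CODA, CODB) pairs of consecutive frames.

    Returns None when the frames seen so far are not decisive.
    """
    previous_codb = None
    for coda, codb in cod_bits:
        if coda != 0:
            raise ValueError("CODA must be 0 for a 2400/600 bps frame")
        if codb == 1:
            return 600  # (0, 1) observed: 600 bps mode is in use
        if previous_codb == 0:
            return 2400  # two successive (0, 0) frames
        previous_codb = codb
    return None
```

The caller would need to feed in only frames known to be consecutive, for example frames taken from a single RTP packet, or from successive packets with no sequence number gap; a single (0, 0) frame on its own is not decisive.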