Spencer Dawkins wrote: > Document: draft-ietf-avt-rtp-vorbis-06 > Reviewer: Spencer Dawkins > Review Date: 26 Oct 2007 (sorry, late!) > IETF LC End Date: 22 Oct 2007 > IESG Telechat date: > > Summary: This document is close to ready for publication as a Proposed > Standard. I have a small number of questions, mostly involving clarity > or 2119 language. Please take a look at the question in 7.1, especially. > > Comments: I have included "nits" in this review for the convenience of > authors and editors later in the process. Nits are not part of a Gen-ART > review. > > 1. Introduction > > Vorbis is a general purpose perceptual audio codec intended to allow > maximum encoder flexibility, thus allowing it to scale competitively > over an exceptionally wide range of bitrates. At the high quality/ > bitrate end of the scale (CD or DAT rate stereo, 16/24 bits), it is > in the same league as AAC. Vorbis is also intended for lower and > > Spencer (nit): AAC? has not been expanded previously. I edited as MPEG-4 AAC, should an informative reference be useful? > > higher sample rates (from 8kHz telephony to 192kHz digital masters) > and a range of channel representations (monaural, polyphonic, stereo, > quadraphonic, 5.1, ambisonic, or up to 255 discrete channels). > > 2.2. Payload Header > > Vorbis Data Type (VDT): 2 bits > > This field specifies the kind of Vorbis data stored in this RTP > packet. There are currently three different types of Vorbis > payloads. Each packet MUST contain only a single type of Vorbis > payload (e.g. you MUST not aggregate configuration and comment > > Spencer: this is close to a nit, but 2119 language is important. This is > just restating the previous MUST. I'd suggest either "must" in lower > case or "MUST NOT" - if there's a reason to have two 2119 requirements > that say the same thing. lowercased > > payload in the same packet) > > 0 = Raw Vorbis payload > 1 = Vorbis Packed Configuration payload > 2 = Legacy Vorbis Comment payload > 3 = Reserved > > The packets with a VDT of value 3 MUST be ignored > > The last 4 bits represent the number of complete packets in this > payload. This provides for a maximum number of 15 Vorbis packets in > the payload. If the packet contains fragmented data the number of > packets MUST be set to 0. > > Spencer (nit): what type of fragmentation is this? Please provide an > adjective :-) Or rephrase as: "If the F field is nonzero the number of packets MUST be set to 0" ? > > 2.3. Payload Data > > The Vorbis packet length header is the length of the Vorbis data > block only and does not count the length field. > > Spencer (nit): s/count/include/, I think. applied > > The payload packing of the Vorbis data packets MUST follow the > guidelines set-out in [3] where the oldest packet occurs immediately > > Spencer: again, adjectives are good. This is saying "the oldest Vorbis > packet", right? It would be better if the specification doesn't use > language like "the oldest packet in the packet" with no adjectives - > that doesn't take us to a good place. Right > > after the RTP packet header. Subsequent packets, if any, MUST follow > in temporal order. > > Spencer: "Subsequent Vorbis packets", right? right > what does the receiver do if the "follow in temporal order" MUST is violated? It doesn't have a way to figure out so it will decode the stream assuming the samples aren't misordered. > > 3.1.1. Packed Configuration > > A Vorbis Packed Configuration is indicated with the Vorbis Data Type > field set to 1. Of the three headers defined in the Vorbis I > specification [10], the identification and the setup MUST be packed > as they are, while the comment header MAY be replaced with a dummy > one. The packed configuration follows a generic way to store xiph > codec configurations: The first field stores the number of the > following packets minus one (count field), the next ones represent > the size of the headers (length fields), the headers immediately > follow the list of length fields. The size of the last header is > implicit. The count and the length fields are encoded using the > following logic: the data is in network order, every byte has the > > Spencer (nit): c/network order/network byte order/, and there are > multiple occurrences in the document... applied two times. > > most significant bit used as flag and the following 7 used to store > the value. The first N bit are to be taken, where N is number of > bits representing the value modulo 7, and stored in the first byte. > If there are more bits, the flag bit is set to 1 and the subsequent > 7bit are stored in the following byte, if there are remaining bits > set the flag to 1 and the same procedure is repeated. The ending > byte has the flag bit set to 0. In order to decode it is enough to > iterate over the bytes until the flag bit set to 0, for every byte > the data is added to the accumulated value multiplied by 128. The > headers are packed in the same order they are present in ogg: > identification, comment, setup. > > 3.2. Out of Band Transmission > > This section, as stated above, does not cover all the possible out- > of-band delivery methods since they rely on different protocols and > are linked to specific applications. The following packet definition > SHOULD be used in out-of-band delivery and MUST be used when > > Spencer: is there an obvious reason to violate the SHOULD? If the underlying transport has other syntax providing already the minimal framing and tagging features required would be an unnecessary overhead. > > Configuration is inlined in the SDP. > > 5.1. Example Fragmented Vorbis Packet > > The Fragment type field is set to 2 and the number of packets field > is set to 0. For large Vorbis fragments there can be several of > these type of payload packets. The maximum packet size SHOULD be no > > Spencer (nit): s/these type/this type/? fixed > > Spencer: why is this a SHOULD? sctp does a better job than fragmenting at rtp level. > > greater than the path MTU, including all RTP and payload headers. > The sequence number has been incremented by one but the timestamp > field remains the same as the initial packet. > > 5.2. Packet Loss > > As there is no error correction within the Vorbis stream, packet loss > will result in a loss of signal. Packet loss is more of an issue for > fragmented Vorbis packets as the client will have to cope with the > handling of the Fragment Type. In case of loss of fragments the > client MUST discard all the remaining fragments and decode the > > Spencer (nit) - "remaining Vorbis fragments" and "incomplete Vorbis > packet"? right. > > incomplete packet. If we use the fragmented Vorbis packet example > above and the first packet is lost the client MUST detect that the > > Spencer (nit) - "and the first RTP packet is lost"? right. > > next packet has the packet count field set to 0 and the Fragment type > 2 and MUST drop it. The next packet, which is the final fragmented > packet, MUST be dropped in the same manner. If the missing packet is > the last, the received two fragments will be kept and the incomplete > vorbis packet decoded. > > 6. IANA Considerations > > configuration-uri: the URI [4] of the configuration headers in > case of out of band transmission. In the form of > "protocol://path/to/resource/", depending on the specific > > Spencer: isn't this "scheme://path/to/resource"? good call =) > > method, a single configuration packet could be retrived by its > Ident number, or multiple packets could be aggregated in a > single stream. Non hierarchical protocols MAY point to a > resource using their specific syntax. > > 7.1. Mapping Media Type Parameters into SDP > > The information carried in the Media Type media type specification > has a specific mapping to fields in the Session Description Protocol > (SDP) [5], which is commonly used to describe RTP sessions. When SDP > is used to specify sessions the mapping are as follows: > > Spencer: is there anything you can say about Receiver behavior when the > Media Type and SDP don't map correctly? Should I explicitly state that the Receiver should drop the session? > > 7.1.1. SDP Example > > The following example shows a basic SDP single stream. The first > configuration packet is inlined in the sdp, other configurations > > Spencer (nit): it would be great if SDP, URI etc. were consistently > capitalized. Done > > could be fetched at any time from the first provided uri using or all > > Spencer: is this "the URI currently in use"? but the sentence doesn't > parse as written. Relic from a previous revision, removed. > > the known configuration could be downloaded using the second uri. > The inline base64 [9] configuration string is trimmed because of the > > Spencer: is this "is folded in this example due to RFC line length > limitations"? yes > > length. > c=IN IP4 192.0.2.1 > m=audio RTP/AVP 98 > a=rtpmap:98 vorbis/44100/2 > a=fmtp:98 delivery-method=inline; > configuration=AAAAAZ2f4g9NAh4aAXZvcmJpcwA...; delivery- > method=out_band; configuration-uri=rtsp://path/to/the/resource; > delivery-method=out_band; > configuration-uri=http://another/path/to/resource/; > > Note that the payload format (encoding) names are commonly shown in > upper case. Media Type subtypes are commonly shown in lower case. > These names are case-insensitive in both places. Similarly, > parameter names are case-insensitive both in Media Type types and in > the default mapping to the SDP a=fmtp attribute. The exception > regarding case sensitivity is the configuration-uri URI which MUST be > regarded as being case sensitive. The a=fmtp line is a single line > > Spencer: "shown as multiple lines in this document for clarity"? > ok > even if it is presented broken because of clarity. > > 7.2. Usage with the SDP Offer/Answer Model > > The only paramenter negotiable is the delivery method. All the > > Spencer (nit): c/paramenter negotiable/negotiable parameter/ > > others are declarative: the offer, as described in An Offer/Answer > Model Session Description Protocol [8], may contain a large number of > delivery methods per single fmtp attribute, the answerer MUST remove > every delivery-method and configuration-uri not supported. All the > parameters MUST not be altered on answer otherwise. > > 8. Congestion Control > > Vorbis clients SHOULD send regular receiver reports detailing > > Spencer: is there a well-understood definition of "regular" within this > community? to my best knowledge it is. > > congestion. A mechanism for dynamically downgrading the stream, > known as bitrate peeling, will allow for a graceful backing off of > the stream bitrate. This feature is not available at present so an > alternative would be to redirect the client to a lower bitrate stream > if one is available. > > 9.1. Stream Radio > > This is one of the most common situation: one single server streaming > content in multicast, the clients may start a session at random time. > The content itself could be a mix of live stream, as the wj's voice, > > Spencer (nit): "wj's"? please spell this out (I'm guessing what this means) webjockey > > and stored streams as the music she plays. > -- Luca Barbato Gentoo/linux Gentoo/PPC http://dev.gentoo.org/~lu_zero _______________________________________________ Ietf@xxxxxxxx https://www1.ietf.org/mailman/listinfo/ietf