Re: Gen-ART Last Call review of draft-ietf-avt-rtp-vorbis-06

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Spencer Dawkins wrote:

> Document: draft-ietf-avt-rtp-vorbis-06
> Reviewer: Spencer Dawkins
> Review Date: 26 Oct 2007 (sorry, late!)
> IETF LC End Date:   22 Oct 2007
> IESG Telechat date:
> 
> Summary: This document is close to ready for publication as a Proposed
> Standard. I have a small number of questions, mostly involving clarity
> or 2119 language. Please take a look at the question in 7.1, especially.
> 
> Comments: I have included "nits" in this review for the convenience of
> authors and editors later in the process. Nits are not part of a Gen-ART
> review.
> 
> 1.  Introduction
> 
>   Vorbis is a general purpose perceptual audio codec intended to allow
>   maximum encoder flexibility, thus allowing it to scale competitively
>   over an exceptionally wide range of bitrates.  At the high quality/
>   bitrate end of the scale (CD or DAT rate stereo, 16/24 bits), it is
>   in the same league as AAC.  Vorbis is also intended for lower and
> 
> Spencer (nit): AAC? has not been expanded previously.

I edited as MPEG-4 AAC, should an informative reference be useful?

> 
>   higher sample rates (from 8kHz telephony to 192kHz digital masters)
>   and a range of channel representations (monaural, polyphonic, stereo,
>   quadraphonic, 5.1, ambisonic, or up to 255 discrete channels).
> 
> 2.2.  Payload Header
> 
>   Vorbis Data Type (VDT): 2 bits
> 
>   This field specifies the kind of Vorbis data stored in this RTP
>   packet.  There are currently three different types of Vorbis
>   payloads.  Each packet MUST contain only a single type of Vorbis
>   payload (e.g. you MUST not aggregate configuration and comment
> 
> Spencer: this is close to a nit, but 2119 language is important. This is
> just restating the previous MUST. I'd suggest either "must" in lower
> case or "MUST NOT" - if there's a reason to have two 2119 requirements
> that say the same thing.

lowercased

> 
>   payload in the same packet)
> 
>      0 = Raw Vorbis payload
>      1 = Vorbis Packed Configuration payload
>      2 = Legacy Vorbis Comment payload
>      3 = Reserved
> 
>   The packets with a VDT of value 3 MUST be ignored
> 
>   The last 4 bits represent the number of complete packets in this
>   payload.  This provides for a maximum number of 15 Vorbis packets in
>   the payload.  If the packet contains fragmented data the number of
>   packets MUST be set to 0.
> 
> Spencer (nit): what type of fragmentation is this? Please provide an
> adjective :-)

Or rephrase as:

"If the F field is nonzero the number of packets MUST be set to 0" ?

> 
> 2.3.  Payload Data
> 
>   The Vorbis packet length header is the length of the Vorbis data
>   block only and does not count the length field.
> 
> Spencer (nit): s/count/include/, I think.

applied

> 
>   The payload packing of the Vorbis data packets MUST follow the
>   guidelines set-out in [3] where the oldest packet occurs immediately
> 
> Spencer: again, adjectives are good. This is saying "the oldest Vorbis
> packet", right? It would be better if the specification doesn't use
> language like "the oldest packet in the packet" with no adjectives -
> that doesn't take us to a good place.

Right

> 
>   after the RTP packet header.  Subsequent packets, if any, MUST follow
>   in temporal order.
> 
> Spencer: "Subsequent Vorbis packets", right?

right

> what does the receiver do if the "follow in temporal order" MUST is violated?

It doesn't have a way to figure out so it will decode the stream
assuming the samples aren't misordered.

> 
> 3.1.1.  Packed Configuration
> 
>   A Vorbis Packed Configuration is indicated with the Vorbis Data Type
>   field set to 1.  Of the three headers defined in the Vorbis I
>   specification [10], the identification and the setup MUST be packed
>   as they are, while the comment header MAY be replaced with a dummy
>   one.  The packed configuration follows a generic way to store xiph
>   codec configurations: The first field stores the number of the
>   following packets minus one (count field), the next ones represent
>   the size of the headers (length fields), the headers immediately
>   follow the list of length fields.  The size of the last header is
>   implicit.  The count and the length fields are encoded using the
>   following logic: the data is in network order, every byte has the
> 
> Spencer (nit): c/network order/network byte order/, and there are
> multiple occurrences in the document...

applied two times.

> 
>   most significant bit used as flag and the following 7 used to store
>   the value.  The first N bit are to be taken, where N is number of
>   bits representing the value modulo 7, and stored in the first byte.
>   If there are more bits, the flag bit is set to 1 and the subsequent
>   7bit are stored in the following byte, if there are remaining bits
>   set the flag to 1 and the same procedure is repeated.  The ending
>   byte has the flag bit set to 0.  In order to decode it is enough to
>   iterate over the bytes until the flag bit set to 0, for every byte
>   the data is added to the accumulated value multiplied by 128.  The
>   headers are packed in the same order they are present in ogg:
>   identification, comment, setup.
> 
> 3.2.  Out of Band Transmission
> 
>   This section, as stated above, does not cover all the possible out-
>   of-band delivery methods since they rely on different protocols and
>   are linked to specific applications.  The following packet definition
>   SHOULD be used in out-of-band delivery and MUST be used when
> 
> Spencer: is there an obvious reason to violate the SHOULD?

If the underlying transport has other syntax providing already the
minimal framing and tagging features required would be an unnecessary
overhead.

> 
>   Configuration is inlined in the SDP.
> 
> 5.1.  Example Fragmented Vorbis Packet
> 
>   The Fragment type field is set to 2 and the number of packets field
>   is set to 0.  For large Vorbis fragments there can be several of
>   these type of payload packets.  The maximum packet size SHOULD be no
> 
> Spencer (nit): s/these type/this type/?

fixed

> 
> Spencer: why is this a SHOULD?

sctp does a better job than fragmenting at rtp level.

> 
>   greater than the path MTU, including all RTP and payload headers.
>   The sequence number has been incremented by one but the timestamp
>   field remains the same as the initial packet.
> 
> 5.2.  Packet Loss
> 
>   As there is no error correction within the Vorbis stream, packet loss
>   will result in a loss of signal.  Packet loss is more of an issue for
>   fragmented Vorbis packets as the client will have to cope with the
>   handling of the Fragment Type.  In case of loss of fragments the
>   client MUST discard all the remaining fragments and decode the
> 
> Spencer (nit) - "remaining Vorbis fragments" and "incomplete Vorbis
> packet"?

right.

> 
>   incomplete packet.  If we use the fragmented Vorbis packet example
>   above and the first packet is lost the client MUST detect that the
> 
> Spencer (nit) - "and the first RTP packet is lost"?

right.

> 
>   next packet has the packet count field set to 0 and the Fragment type
>   2 and MUST drop it.  The next packet, which is the final fragmented
>   packet, MUST be dropped in the same manner.  If the missing packet is
>   the last, the received two fragments will be kept and the incomplete
>   vorbis packet decoded.
> 
> 6.  IANA Considerations
> 
>      configuration-uri:  the URI [4] of the configuration headers in
>         case of out of band transmission.  In the form of
>         "protocol://path/to/resource/", depending on the specific
> 
> Spencer: isn't this "scheme://path/to/resource"?

good call =)

> 
>         method, a single configuration packet could be retrived by its
>         Ident number, or multiple packets could be aggregated in a
>         single stream.  Non hierarchical protocols MAY point to a
>         resource using their specific syntax.
> 
> 7.1.  Mapping Media Type Parameters into SDP
> 
>   The information carried in the Media Type media type specification
>   has a specific mapping to fields in the Session Description Protocol
>   (SDP) [5], which is commonly used to describe RTP sessions.  When SDP
>   is used to specify sessions the mapping are as follows:
> 
> Spencer: is there anything you can say about Receiver behavior when the
> Media Type and SDP don't map correctly?

Should I explicitly state that the Receiver should drop the session?

> 
> 7.1.1.  SDP Example
> 
>   The following example shows a basic SDP single stream.  The first
>   configuration packet is inlined in the sdp, other configurations
> 
> Spencer (nit): it would be great if SDP, URI etc. were consistently
> capitalized.

Done

> 
>   could be fetched at any time from the first provided uri using or all
> 
> Spencer: is this "the URI currently in use"? but the sentence doesn't
> parse as written.

Relic from a previous revision, removed.

> 
>   the known configuration could be downloaded using the second uri.
>   The inline base64 [9] configuration string is trimmed because of the
> 
> Spencer: is this "is folded in this example due to RFC line length
> limitations"?

yes

> 
>   length.
>      c=IN IP4 192.0.2.1
>      m=audio RTP/AVP 98
>      a=rtpmap:98 vorbis/44100/2
>      a=fmtp:98 delivery-method=inline;
>      configuration=AAAAAZ2f4g9NAh4aAXZvcmJpcwA...; delivery-
>      method=out_band; configuration-uri=rtsp://path/to/the/resource;
>      delivery-method=out_band;
>      configuration-uri=http://another/path/to/resource/;
> 
>   Note that the payload format (encoding) names are commonly shown in
>   upper case.  Media Type subtypes are commonly shown in lower case.
>   These names are case-insensitive in both places.  Similarly,
>   parameter names are case-insensitive both in Media Type types and in
>   the default mapping to the SDP a=fmtp attribute.  The exception
>   regarding case sensitivity is the configuration-uri URI which MUST be
>   regarded as being case sensitive.  The a=fmtp line is a single line
> 
> Spencer: "shown as multiple lines in this document for clarity"?
> 

ok

>   even if it is presented broken because of clarity.
> 
> 7.2.  Usage with the SDP Offer/Answer Model
> 
>   The only paramenter negotiable is the delivery method.  All the
> 
> Spencer (nit): c/paramenter negotiable/negotiable parameter/
> 
>   others are declarative: the offer, as described in An Offer/Answer
>   Model Session Description Protocol [8], may contain a large number of
>   delivery methods per single fmtp attribute, the answerer MUST remove
>   every delivery-method and configuration-uri not supported.  All the
>   parameters MUST not be altered on answer otherwise.
> 
> 8.  Congestion Control
> 
>   Vorbis clients SHOULD send regular receiver reports detailing
> 
> Spencer: is there a well-understood definition of "regular" within this
> community?

to my best knowledge it is.

> 
>   congestion.  A mechanism for dynamically downgrading the stream,
>   known as bitrate peeling, will allow for a graceful backing off of
>   the stream bitrate.  This feature is not available at present so an
>   alternative would be to redirect the client to a lower bitrate stream
>   if one is available.
> 
> 9.1.  Stream Radio
> 
>   This is one of the most common situation: one single server streaming
>   content in multicast, the clients may start a session at random time.
>   The content itself could be a mix of live stream, as the wj's voice,
> 
> Spencer (nit): "wj's"? please spell this out (I'm guessing what this means)

webjockey

> 
>   and stored streams as the music she plays.
> 


-- 

Luca Barbato

Gentoo/linux Gentoo/PPC
http://dev.gentoo.org/~lu_zero

_______________________________________________

Ietf@xxxxxxxx
https://www1.ietf.org/mailman/listinfo/ietf

[Index of Archives]     [IETF Annoucements]     [IETF]     [IP Storage]     [Yosemite News]     [Linux SCTP]     [Linux Newbies]     [Fedora Users]