Unfortunately this is vacation time in Scandinavia - won't do anything with this until mid-August, I'm afraid. (Henrik might.) So it'll have to wait another 2 telechats or so..... On 07/29/2015 11:58 PM, Ben Campbell wrote: > Hi VP8 Authors: > > Has there been a response to Elwyn's Gen-ART review from the authors? > If so, I haven't seen it. (I did see Colin's response). > > Based Elwyn's review, and on Harald's response to my earlier > questions, I assume that the authors will wish to update the draft > before it goes back to an IESG telechat. If that's not the case, > please let me know. Please be advised that tomorrow (July 30) is the > nominal deadline for getting things on the August 6 telechat. > > Thanks! > > Ben. > > > On 13 Jul 2015, at 15:57, Ben Campbell wrote: > >> (+payload and Harald) >> >> Elwyn: Thanks for the detailed review! >> >> Authors: Please address Elwyn's comments at your earliest >> convenience. There's quite a bit here, and I assume we will need a >> new revision before taking this to the IESG telechat, but it would be >> helpful to discuss any proposed changes via email before submitting. >> >> Thanks! >> >> Ben. >> >> >> On 13 Jul 2015, at 6:58, Elwyn Davies wrote: >> >>> I am the assigned Gen-ART reviewer for this draft. For background on >>> Gen-ART, please see the FAQ at >>> >>> <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>. >>> >>> Please resolve these comments along with any other Last Call comments >>> you may receive. >>> >>> Document: draft-ietf-payload-vp8-16.txt >>> Reviewer: Elwyn Davies >>> Review Date: 2015/07/10 >>> IETF LC End Date: 2015/07/13 >>> IESG Telechat date: (if known) - >>> >>> Summary: Gosh, it has been a long time since the last review! >>> >>> Major issues: >>> >>> Minor issues: >>> s4.2: Various of the frame indices either SHOULD or, in the case of >>> KEYIDX, MAY start at random values. The rationale(s) for these >>> requirements are not explained - nor is the reason why it it is only a >>> MAY for KEYIDX whereas it is SHOULD for PictureID and what might happen >>> were the MAY/SHOULD to be ignored. >>> >>> Is the rationale something that should be mentioned in the security >>> considerations? >>> >>> s10.1: The downref to RFC 6386 identified by idnits was duly noted >>> in the last call >>> announcement. I don't have a problem with the downref. >>> >>> Nits/editorial comments: >>> >>> General: Consistent usage of "prediction and mode" vs >>> "prediction/mode" would be desirable. >>> >>> s1, para 2: Needs to introduce the 'macroblock' jargon defined in >>> RFC 6386. Suggest: >>> OLD: >>> VP8 is based on decomposition of frames into square sub-blocks of >>> pixels, prediction of such sub-blocks using previously constructed >>> blocks, and... >>> NEW: >>> VP8 is based on decomposition of frames into square sub-blocks of >>> pixels known as "macroblocks" (see Section 2 of [RFC6386]). Prediction >>> of such sub-blocks using previously constructed blocks, and... >>> END >>> >>> s2: There are a number of technical terms brought over from RFC 6386. >>> These should be listed in s2. At least the following would be in >>> the list: >>> key frame, interframe, golden frame, altref frame, macroblock. >>> Also the terms picture selection index (RPSI) and slice loss >>> indication (SLI) from RFC 4585. >>> >>> s3, para 2: Providing a reference to explain what temporal scalability >>> and the hierarchy selection is all about would be useful. One >>> possibility >>> would be: >>> >>> Lim, J. Yang, and B. Jeon,”Fast Coding Mode Decision for Scalable >>> Video Coding”, >>> 10th Int’l Conf. on Advanced Communication Technology, vol. 3, pp. >>> 1897-1900, 2008. >>> >>> It would also help to add a pointer to the TL0PICIDX description in >>> s4.2, thus: >>> ADD: >>> Support for temporal scalability is provided by the optional >>> TL0PICIDX and TID/Y/KEYIDX >>> fields described in Section 4.2. >>> END >>> >>> ss3 and 4: The discussion of how the frame payload might or might >>> not be distributed across >>> multiple packets is not very clear. The draft has in mind a >>> 'preferred use case' (para 1 >>> of s4.4 says: >>> >>>> In the preferred use case, the S bit and PID fields >>>> described in Section 4.2 should be used to indicate what the packet >>>> contains. >>> >>> I think it would be helpful to be a bit more positive about this in >>> s3, para 3: >>> OLD: >>> This memo >>> allows the partitions to be sent separately or in the same RTP >>> packet. It may be beneficial for decoder error-concealment to send >>> the partitions in different packets, even though it is not mandatory >>> according to this specification. >>> NEW: >>> Accordingly, this document RECOMMENDS that the frame is packetized >>> by the sender >>> with each data partition in a separate packet or packets. This may >>> be beneficial >>> for decoder error concealment and the payload format described in >>> Section 4 provides >>> fields that allow the partitions to be identified even if the first >>> partition is >>> not available. The sender can, alternatively, aggregate the data >>> partitions into >>> a single data stream and, optionally, split it into several packets >>> without >>> consideration of the partition boundaries. The receiver can use the >>> length >>> information in the first partition to identify the partitions during >>> decoding. >>> END >>> >>> s4/Figure 1: s/bytes/octets/ (2 places) >>> >>> Figure 1, caption: s/sequel/Sections 4.2 and 4.3/ >>> >>> Figure 1: The format shown is correct only for the first packet in a >>> frame. Subsequent >>> packets will not contain the payload header and will have later >>> octets of the frame payload. >>> This should be mentioned in drawing and/or in the caption. >>> >>> s4.1, last two paras: >>>> Sequence number: The sequence numbers are monotonically increasing >>>> and set as packets are sent. >>>> >>>> The remaining RTP header fields are used as specified in >>>> [RFC3550]. >>> Redefining the Sequence Number field seems gratuitous and >>> potentially dangerous, >>> given that it is *very carefully* defined in RFC 3550 and the >>> overall intent does >>> not seem any different from that in RFC 3550. OTOH a note about the >>> (non?-)use of the X bit >>> and the Payload Type field (PT) would be useful. I suggest >>> replacing the paras above with: >>> NEW: >>> X bit: MUST be 0. The VP8 RTP profile does not use the RTP Header >>> Extension capability. >>> >>> PT (Payload Type): In line with the policy in Section 3 of >>> [RFC3551], applications >>> using the VP8 RTP payload profile MUST assign a dynamic payload type >>> number to >>> be used in each RTP session and provide a mechanism to indicate the >>> mapping. >>> See Section 6.2 for the mechanism to be used with the Session >>> Description Protocol (SDP) >>> [RFC4566]. >>> >>> The remaining RTP Fixed Header Fields (V, P, CC, sequence number, >>> SSRC and CSRC >>> identfiers) are used as specified in Section 5.1 of [RFC5330]. >>> END >>> Note that this may be considered to make the reference to RFC 3551 >>> normative. >>> >>> s4.2, X bit: There is potential for confusion between the X bit in >>> the fixed header >>> and the X bit in the VP8 Payload Descriptor. Perhaps changing this >>> to C for control bits >>> would avoid the problem. >>> >>> s4.2, M bit: Similarly, it might be better to use B (for BIG) in >>> place of M to avoid confusion. >>> >>> s4.2, para 7 after Fig 2: For consistency: >>> OLD: >>> When the X bit is set to 1 in the first octet, the extension bit >>> field octet MUST be provided as the second octet. If the X bit is 0, >>> the extension bit field octet MUST NOT be present, and no extensions >>> (I, L, T, or K) are permitted. >>> NEW: >>> When the X [or C] bit is set to 1 in the first octet, the Extended >>> Control Bits >>> field octet MUST be provided as the second octet. If the X [or C] >>> bit is 0, >>> the extension bit field octet MUST NOT be present, and no extensions >>> (I, L, T, or K) are permitted. >>> END >>> >>> s4.2: The issue of using either 'wrap' or 'restart' but not both for >>> restarting number sequences has been raised in various comments by >>> IESG members. >>> >>> s4.2, TL0PICIDX: I think, given the statements in s3 about not >>> defining the hierarchy, that more explanation is needed of what the >>> 'lowest or base layer' actually means. >>> >>> s4.2, TL0PICIDX: >>>> If TID is larger than 0, TL0PICIDX >>>> indicates which base layer frame the current image depends on. >>>> TL0PICIDX MUST be incremented when TID is 0. >>> What happens if L=1 but both T=0 and K=0 so that there is no TID >>> value present? Or indeed if T=0 but K=1 so that the TID field is >>> there but 'MUST be ignored by the receiver' (definition of TID field)? >>> >>> s4.3: It would be useful to include the section number (9.1) where >>> the uncompressed data chunk specification is found. I was somewhat >>> surprised that the ordering is reversed in the protocol spec. >>> >>> s4.3, VER: The receiver should validate this field - currently only >>> values 0-3 are valid. >>> >>> s4.3, SizeN: Make it clear these are integer fields: s/in Size0,/in >>> integer fields Size0,/ >>> >>> s4.3 and s4.4: It would be helpful to implementers to explain what >>> the offsets in the first partition payload are for the additional >>> partition count and additional partition sizes. Having rummaged >>> around in RFC 6386, I have to say that I am unclear whether the >>> values are placed after the full chunk of data indicated by >>> 1stPartitionSize or are part of this data. And if the latter is the >>> case, how to work out where the partition sizes are. I guess that >>> they ought to be at offset (1stPartitionSize+10 - key frame - or >>> 1stPartitionSize+3 - interframe) in the VP8 Payload field as it is >>> difficult to work out where they are without decoding the partition >>> otherwise. Also exactly how many partition sizes to expect - I >>> think I read in RFC 6386 that the last partition size is implicit. >>> Help! >>> >>> In the course of clearing this up, it might be appropriate to move >>> the last sentence of the first para of s4.4 and the second para of >>> s4.4 into s4.3 as part of the explanation. This is the relevant text: >>>> Note that the length of the first partition can >>>> always be obtained from the first partition size parameter in the VP8 >>>> payload header. >>>> >>>> The VP8 bitstream format [RFC6386] specifies that if multiple DCT/WHT >>>> partitions are produced, the location of each partition start is >>>> found at the end of the first (prediction/mode) partition. In this >>>> RTP payload specification, the location offsets are considered to be >>>> part of the first partition. >>> >>> >>> s4.4: I found this section very difficult to parse. It is a bit >>> 'stream of consciousness' and would benefit from a reordering and >>> rewrite. Suggestion (assuming the second para gets moved to s4.3): >>> NEW SECTION 4.4: >>> An encoded VP8 frame can be divided into two or more partitions, as >>> described in Section 1. It is OPTIONAL for a packetizer implementing >>> this RTP specification >>> to pay attention to the partition boundaries within an encoded frame. >>> If packetization of a frame is done without considering the partition >>> boundaries, the PID field MAY be set to zero for all packets, and the >>> S bit MUST NOT be set to one for any other packet than the first. >>> >>> If the preferred usage suggested in Section 3 is followed, with each >>> packet >>> carrying data from exactly one partition, the S bit and PID fields >>> described in Section 4.2 SHOULD be used to indicate what the packet >>> contains. The PID field should indicate which partition the first >>> octet of the payload belongs to, and the S bit indicates that the >>> packet starts on a new partition. >>> >>> If the packetizer does not pay attention to the partition >>> boundaries, one >>> packet can contain a fragment of a partition, a complete partition, >>> or an >>> aggregate of fragments and partitions. There is no explicit >>> signaling of >>> partition boundaries in the payload and the partition lengths at the >>> end of >>> the first partition have to be used to identify the boundaries. >>> Partitions >>> MUST be aggregated in decoding order. Two fragments from different >>> partitions MAY be aggregated into the same packet along with one or >>> more >>> complete partitions. >>> >>> In all cases, the payload of a packet MUST contain data from only >>> one video >>> frame. Consequently the set of packets carrying the data from a >>> particular >>> frame will contain exactly one VP8 Payload Header (see Section 4.3) >>> carried >>> in the first packet of the frame. The last, or only, packet >>> carrying data for the >>> frame MUST have the M bit set in the RTP header. >>> >>> s4.5.1: Th algorithm only applies for the RECOMMENDED use case with >>> partitions in separate packets. >>> This should be noted. >>> >>> s5.2, last para: s/only parts of the frame is corrupted./only part >>> of the frame is corrupted./ >>> >>> s6: s/This payload format has two required parameters./This payload >>> format has two optional parameters./ >>> [I think this is probably a hangover from the previous version.] >>> >>> s7: s/Cryptographic system may also allow/The cryptographic system >>> may also allow/ >>> >>> s7: >>>> the usage of SRTP [RFC3711] is recommended. >>> RECOMMENDED? >>> >>> s10.2: I suspect that RFC 3551 is normative. -- Surveillance is pervasive. Go Dark.