Hi Dale,
First of all thank you very much for your thorough feedback and sorry very much for the late reply.
I will answer the topics inline and open relevant issues on github so we can have better tracking.
As a general comment, while reviewing the feedback gathered from the different directorates reviews, I have found conflicting requests regarding to both simultaneously further document behaviours and add extra clarifications for areas that are outside their main expertise area of the reviewers, while requesting the opposite, remove clarifications to not be duplicated//conflicting with what is already specified elsewhere, when it fits in their expertise domain.
So while I will try to accommodate all the feedback, I think that it is important to align the document with the target audience and the purpose of spec when choosing areas should be more verbose documented vs the one that we should just keep raw references.
On Tue, Aug 8, 2023 at 5:46 PM Dale Worley via Datatracker <noreply@xxxxxxxx> wrote:
Reviewer: Dale Worley
Review result: Ready with Issues
I am the assigned Gen-ART reviewer for this draft. The General Area
Review Team (Gen-ART) reviews all IETF documents being processed
by the IESG for the IETF Chair. Please treat these comments just
like any other last call comments.
For more information, please see the FAQ at
<https://trac.ietf.org/trac/gen/wiki/GenArtfaq>.
Document: draft-ietf-wish-whip-09
Reviewer: Dale R. Worley
Review Date: 2023-08-08
IETF LC End Date: 2023-08-08
IESG Telechat date: [not known]
Summary:
This draft is on the right track but has open issues, described in
the review.
The exposition could be clarified in several places. A few of the
clarifications could be considered technical issues. I list the
issues in textual order, with an initial summary of the most important
issues.
* Summary of the major issue:
A very important aspect of this document is that it defines the usage
of "ICE via SDP via HTTP", equivalently, "supporting ICE offer/answer
over HTTP" (as opposed to the original "ICE via SDP via SIP"). That
definition likely will have broader usage than just WHIP. But the
document suffers from the "expert's disease", in that the authors
clearly have deep knowledge of ICE and consequently only document the
details of the ICE processing without providing any of the framework.
This leaves naive readers to reconstruct the big picture. I was going
to add "As far as I can tell, all of the needed details are listed
somewhere in section 4", but as you can see below, once I wrote an
outline of what is needed, there are a few points that don't seem to
be specified, at least not to the point that I recognized it.)
I don't think I agree with that. The document defines a signaling protocol for establishing a WebRTC session between a server and client.
ICE is just one of the protocol layers involved (DTLS and RTP/RTCP are the other ones). ICE has a more visible control layer for the applications, and that's why it has some REST operations exposed.
I think that it is fair to require some knowledge about WebRTC (and HTTP) as a precondition for reading the draft.
Nevertheless, I have done some editorial work on the introduction and overview section that I hope makes it more easy to read.
I suggest that section 4 be organized by:
- stating that it is defining "supporting ICE offer/answer
over HTTP" (so that it can be clearly referenced by later
specifications)
- specifying the abstract operations involved (with the mapping to
specific operations listed either in this list or a later list):
- - starting the ICE negotiation (via POST carrying
application/sdp)
- - updating the ICE when trickling (via PATCH carrying
application/trickle-ice-sdpfrag)
- - restarting ICE (via PATCH carrying (via PATCH carrying
application/trickle-ice-sdpfrag, but it's not clear how this is
distinguished, RFC 8840 does not specify it; perhaps because it
carries ice-ufrag/ice-pwd attributes (see section 4.1)?)
- - termination (via DELETE from the client; not clear how from the
server)
- specifying how WHIP constrains the ICE abstraction (Media server
MUST not do trickle ICE updates, not clear if Media server MAY do
ICE restarts, Media server MAY not support trickle ICE or ICE
restart, etc.)
I will add a small introduction about the role of ICE protocol and the operations involved before hard jumping into the spec section.
- discussion of how out-of-order requests may arise (which isn't at
all clear to me, as requests seem to be generated only by the
server, and carried by TCP, so they seem to be inherently
serialized)
HTTP requests may not be sent in the same TCP connection, so they may be received out of order by the media server (especially if they traverse different proxies due to load balancing).
- discussion of how ETags MAY/MUST be handled, as that is
comprehensible only when one looks at the serialization needs of all
of the operations together
I have received a feedback from the HTTPDIR which seems to be in the opposite direction: https://github.com/wish-wg/webrtc-http-ingest-protocol/issues/141
I will try to work on this and try to find a solution that works for both.
- specifying the particular features of each of the operations,
including any particulars of request and response fields and what
response codes are to be used for what error situations
I don't think I have fully understood this feedback.
* Summary of minor issues that might have technical content:
At various points in section 4 the text refers to a device sending a
request or generating a response but in many of them, the text doesn't
state whether the device is on the client or server end, or might be
both. I assume that the nature of the operation implies what devices
might perform it based on text elsewhere, but it's hard to fully
understand on the first reading pass. If possible, each sentence
should make this clear. -- Then again, if section 4 is reorganized to
describe generic ICE/SDP/HTTP usage, attention should be paid that the
generic usage may define how e.g. the server does an operation, even
if WHIP servers may not do that operation.
I have not found a concrete example of this issue in the draft, could you provide one?
By definition, clients send the HTTP requests and servers respond to them, so I am a bit lost on this one.
Section 4 lists a collection of 4xx and 5xx response codes to be used
in certain situations. Are these set due to the specific semantics of
the codes or just to ensure that the recipient can tell what the cause
of the error was? (HTTP suffers from having a large number of error
responses but all of them have vague semantics. Unfortunately, there
doesn't seem to be an HTTP response header that can carry codes
defined for ICE usage specifically.)
We are mapping the HTTP error codes semantics to the specific errors that we can forese for WHIP operations in order to give guidance to the client about what the root cause of the issue could be.
Section 4 requires the WHIP server to reject certain specific request
methods, but no others. Verify that you intend this specific
limitation.
Yes, that is the intend, we have received some further request to clarify the specific error codes: https://github.com/wish-wg/webrtc-http-ingest-protocol/issues/135
Section 4.1 references sections 6.6.2, 2.3, and 3.1 of RFC 9110. All
of these references appear to be incorrect, in that those sections do
not discuss the topics at hand. Section 4.3 references section 6.4.7
of RFC 9110, which does not exist. Some of these appear to be
referencing sections in RFC 7231, which is obsoleted by RFC 9110.
Good catch, I updated the rfc reference, but not the sections:
Section 4.1 talks about out-of-order PATCH requests but is unclear
about what may cause them, as the immediate implication is that
somehow the client is sending overlapping PATCH requests. Probably
the authors fully understand the allowed sequences of operations that
cause this but the text needs to clarify what is being described.
We are not specifying any requirement on HTTP transports (i.e. they could go over different TCP connections) so if you send two independent HTTP requests, they may be received out of order by the server.
Section 4.1 has inconsistent statements about a WHIP resource's
response code if the client requests an ICE restart but the resource
does not support ICE restart.
This was also raised by httpdir:
Section 4.2 extends RFC 8842, and this document should be marked as
doing so.
I will ask for chairs advice on this.
In Section 4.4, I notice that there's no explicit differentiation
between the STUN and the TURN server information in returned Link
header fields. Please verify that either (1) the presence of username
and credential-type differentiates the two cases, or (2) the Media
client/server do not need to differentiate the two cases.
The scheme part of the url of the server differentiates between a STUN and a TURN server:
Link: <stun:stun.example.net>; rel="ice-server"Link: <turn:turn.example.net?transport=udp>; rel="ice-server";
username="user"; credential="myPassword"; credential-type="password"
Section 4.7 requires POST responses to include Link headers for any
WHIP extensions supported by the server. Perhaps OPTIONS responses
should be required to do so also.
The Link headers not only describe the presence of the extension, but also the URL that can be used as an entry point for that extension. Those are WHIP Session/Resource specific and are not known when the OPTIONS request is received.
The designators of WHIP extensions are described as URIs. In fact,
they are URNs, and it would be more precise to refer to them as URNs
in the text.
Will do
* All issues (most of them exposition/editorial), in text order:
I have included all the changes below in the following PR:
1. Introduction
It also describes how to negotiate media flows
using the Offer/Answer Model with the Session Description Protocol
(SDP) [RFC3264] as well as the formats for data sent over the wire
(e.g., media types, codec parameters, and encryption).
I think "as well as" should be "including".
Done.
RTSP [RFC7826], which is based on RTP, is
not compatible with the SDP offer/answer model [RFC3264].
This statement might be clarified, as the naive reader (me) considers
offer/answer to also be based on RTP. I suspect that the critical
meaning is "RTSP, unlike SDP offer/answer, has no provisions for
negotiating the characteristics of the media session."
Changed to:
RTSP {{?RFC7826}}, which is based on RTP, does not support the SDP offer/answer model {{!RFC3264}} for negotiating the characteristics of the media session.
Would that work for you?
This document proposes a simple protocol for supporting WebRTC as
media ingestion method which:
I would amend to "... a simple protocol, WHIP, for ..." to have a
clear assertion in the introduction of what WHIP is. I would also add
"HTTP", as the preceding text describes what WHIP *isn't* based on:
This document proposes a simple protocol, WHIP, based on HTTP, for
supporting WebRTC as media ingestion method which:
--
* Allows for ingest both in traditional media platforms and in
WebRTC end-to-end platforms with the lowest possible latency.
Verify that "ingest" is used in the field as shorthand for "ingestion"
(rather than being a typo). The text seems to use both words with the
same meaning; it should probably use only one.
I am not an native english speaker, so I asked ChatGPT to rewrite it, would this be better?
"Enables ingestion on both traditional media platforms and WebRTC end-to-end platforms, achieving the lowest possible latency."
* Is usable both in web browsers and in native encoders.
I would say "standalone encoders", but perhaps this is the industry
terminology.
Changed.
2. Terminology
For clarity, I would move Figure 1 to after paragraph 1 of section 2
and amend it to something like this. Or perhaps better, relocate the
current section 3 before section 2, so it can provide the context for
the glossary.
+--------------+ +-------------+ +---------------+ +--------------+
| Media client | | WHIP client | | WHIP server | | Media server |
+------+-------+ +--+----------+ +---------+---+-+ +------+-------+
| | | | |
| | | +-+-------------+ |
| | | | WHIP resource | |
| | | +--------|------+ |
| | | | |
| |HTTP POST (SDP Offer) | | |
| +------------------------>+ | |
| |201 Created (SDP answer) | | |
| +<------------------------+ | |
| | ICE REQUEST | |
+---------------------------------------------------------------->+
| | ICE RESPONSE | |
|<----------------------------------------------------------------+
| | DTLS SETUP | |
|<===============================================================>|
| | RTP/RTCP FLOW | |
+<--------------------------------------------------------------->+
| HTTP DELETE |
+----------------------------------->|
| 200 OK |
<------------------------------------+
The reason to move it before the definitions is that definitions would
be a lot clearer with the diagram already in mind.
This chapter has been rewritten based on different directorate feedback, I think it would address also your feedback as well.
The "WHIP client" is separated into two components, one dealing with
HTTP and SDP and one dealing with RTP, with both being involved in
ICE, in the same way that the server is divided into two components
with the same division of responsibilities.
I suspect that historically, the server-side is conceptualized that
the "Media Server" exists a priori with the "WHIP endpoint" being
added to it as an extra unit, but the client-side is conceptualized as
being a unitary system. That structure makes the specification harder
to understand, though, and I think showing the client and server with
more parallel structures would clarify the specification.
Traditionally, server side components have two different components, the signaling one, (which deals with HTTP and signaling) and the media server (which deals with ICE/DTLS/SRTP). While they can be co-located in the same server/daemon, they are logically different entities.
The "WHIP resource" is shown as being a dependent of the "WHIP
server".
The "WHIP endpoint" is renamed "WHIP server". At least in the SIP
world, "endpoint" is used to refer to *both* ends of one connection,
and so the term "WHIP endpoint" would be expected to include the WHIP
client as well.
I will remove the term WHIP server, as it is not defined anywhere and use the proper terminology in each case.
The capitalization of "Media server" is made consistent with the other
terms. (See also the entry in the glossary.)
Make it lower case always.
Although conceptually the left two items (Media client and WHIP
client) and the right three items (WHIP server, WHIP resource, and
Media server) collectively form the two end-systems of the
client/server relationship, I have not tried to invent names for the
two collections because there doesn't seem to be a need to reference
them as wholes.
These changes do require adjusting the terminology in the rest of the
document.
3. Overview
See comments on section 2.
4. Protocol Operation
The SDP offer SHOULD use the "sendonly" attribute and the SDP answer
MUST use the "recvonly" attribute in any case.
You may want to extend this to "SHOULD use the "sendonly" attribute
but MAY use the "sendrecv" attribute", since the other possible values
are useless.
"in any case" seems redundant given the presence of "MUST".
Changed to:
The SDP offer SHOULD use the "sendonly" attribute or MAY use the "sendrecv" attribute instead, "inactive" and "recvonly" attributes MUST NOT be used. The SDP answer MUST use the "recvonly" attribute.
POST /whip/endpoint HTTP/1.1
I recommend preceding each of the HTTP messages with a description of
what the endpoints are. E.g. "POST request from WHIP client to WHIP
server" and "201 Created response from WHIP server to WHIP client".
Figure 2: HTTP POST doing SDP O/A example
Perhaps more informative as "Example WHIP POST executing SDP
offer/answer".
There is not such thing as "WHIP POST", renamed to:
"Example of SDP offer/answer exchange done via an HTTP POST"
would that work for you?
Once a session is setup, ICE consent freshness [RFC7675] SHALL be
used to detect non graceful disconnection and DTLS teardown for
session termination by either side.
I think the meaning is "Once a session is set up, ICE consent
freshness [RFC7675] MUST be maintained. Both endpoints MUST monitor
freshness and MUST tear down the connection if freshness is lost."
(Note I don't know how ICE consent freshness is done, so I am assuming
that no further details need be specified to ensure interoperability.)
Consent Freshness is a term defined in RFC7656 as "Maintaining and renewing consent over time.", and is the name of the whole procedure to apply. Will remove the ICE part as this is not defined like that anywhere.
Probably the 2nd following paragraph ("A Media Server terminating...")
can be combined with this paragraph. Also, expand it to place the
same requirement on the Media client.
WHIP clients and media servers do not have the same requirements on session termination, as the media server will only perform the shutdown in abnormal cases and the WHIP client is mandated to terminate it by explicit sending an HTTP DELETE command.
The WHIP endpoints MUST return an "405 Method Not Allowed" response
for any HTTP GET, HEAD or PUT requests on the endpoint URL in order
to reserve its usage for future versions of this protocol
specification.
This and the 2nd following paragraph only forbid a few methods, and
allow an implementation to implement any others as extensions. Please
verify that these are exactly what you need. Also, the wording should
be "reserve their usage" because the referent is "any ... requests".
I have changed the wording to exclude all methods except the ones defined in the document for each URL
The WHIP endpoints MUST return an "405 Method Not Allowed" response for any HTTP request method different than OPTIONS and POST on the endpoint URL in order to reserve their usage for future versions of this protocol specification.
The WHIP sessions MUST return an "405 Method Not Allowed" response for any HTTP request method different than PATCH and DELETE on the session URLs in order to reserve their usage for future versions of this protocol specification.
The WHIP sessions MUST return an "405 Method Not Allowed" response for any HTTP request method different than PATCH and DELETE on the session URLs in order to reserve their usage for future versions of this protocol specification.
The WHIP endpoints MUST support OPTIONS requests for Cross-Origin
Resource Sharing (CORS) as defined in [FETCH] and it SHOULD include
an "Accept-Post" header with a mime type value of "application/sdp"
on the "200 OK" response to any OPTIONS request received as per
[W3C.REC-ldp-20150226].
I think this would be less awkward as
The WHIP endpoints MUST support OPTIONS requests for Cross-Origin
Resource Sharing (CORS) as defined in [FETCH]. The "200 OK"
response to any OPTIONS request SHOULD include an "Accept-Post"
header with a mime type value of "application/sdp" as per
[W3C.REC-ldp-20150226].
Changed.
4.1. ICE and NAT support
Text might be added to the beginning of this section similar to the
text at the beginning of section 4.2, as WHIP puts specific
restrictions on ICE usage for simplicity's sake. It seems like "NAT"
should be replaced by something else, as "NAT" is not mentioned
elsewhere in the document. "STUN/TURN" seems to be natural, but there
is a section for that (4.4). Perhaps this section should be titled
"ICE usage".
Changed title to ICE support, I will add an introductory paragraph talking about ICE and NAT.
Also, several subsections of section 4 seem to be about usage and
restrictions of various protocols; perhaps they should have parallel
names e.g. "ICE usage", "WebRTC usage", "STUN/TURN usage".
I am not sure if I like that better, will review it one we have the full editorial changes merged.
In order to simplify the protocol, there is no support for exchanging
gathered trickle candidates from Media Server ICE candidates once the
SDP answer is sent.
I think this would be clearer as follows, since that moves the fact we
are talking about the Media server earlier in the sentence:
In order to simplify the protocol, there is no support for the
Media server sending further trickle candidates once the SDP answer
is sent.
I have simplified it further to:
In order to simplify the protocol, exchanging media server's gathered ICE candidates after sending the SDP answer is not supported.
--
The WHIP client MAY perform trickle ICE or ICE restarts as per
[RFC8838] by sending an HTTP PATCH request ...
PATCH is an unusual method, defined in RFC 5789, and it seems like
there should be a reference for it here.
Added
Also, you probably want to insert here a statement whether the WHIP
resource MAY *initiate* ICE restarts; as far as I can tell, the text
does not address this.
Also, you want to directly state that the semantics (meaning) of an
ICE update/restart request and thus how it is to be processed, is
determined by the body, as specified in RFC 8840. The current text
leaves that silently implied.
Will do that as part of https://github.com/wish-wg/webrtc-http-ingest-protocol/issues/147
If the WHIP resource supports either Trickle ICE or ICE restarts, but
not both, it MUST return a "405 Not Implemented" response for the
HTTP PATCH requests that are not supported.
If the WHIP resource does not support the PATCH method for any
purpose, it MUST return a "501 Not Implemented" response, as
described in [RFC9110] Section 6.6.2.
The canonical description for "405" is "Method not allowed". But I
would reorganize and combine these paragraphs along these lines:
If the WHIP resource supports either Trickle ICE or ICE restarts,
but not both, it MUST return a "405 Method not allowed" response
for HTTP PATCH requests for the feature that is not supported. If
neither feature is supported, it MUST return a "501 Not
Implemented" response for such HTTP PATCH requests.
This allows PATCH requests for other purposes, if any of those are
ever defined.
Will do it alongside https://github.com/wish-wg/webrtc-http-ingest-protocol/issues/135
BTW, the reference "[RFC9110] Section 6.6.2" ("Trailer") seems to be
unrelated to the text; please verify it.
Yes, I updated the RFC reference, but not the section, should be 15.6.2
As the HTTP PATCH request sent by a WHIP client may be received out-
of-order by the WHIP resource, ...
To clarify this, I would expand it to:
The WHIP client MAY send overlapping HTTP PATCH requests to one
WHIP resource. As the HTTP PATCH request sent by a WHIP client may
be received out-of-order by the WHIP resource, ...
Reworded to:
The WHIP client MAY send overlapping HTTP PATCH requests to one WHIP session. Consequently, as those HTTP PATCH requests may be received out-of-order by the WHIP session, the WHIP session...
However it is unclear to me why the client would want to do this.
Perhaps the concept is that the server may also be sending PATCH
requests for trickle ICE etc.? In that case, you would want to phrase
it more like:
As the WHIP client and the WHIP resource may be both be sending
HTTP PATCH requests without coordination, an HTTP PATCH request
sent by a WHIP client may be received out-of-order by the WHIP
resource, ...
Only the WHIP client sends HTTP PATCH requests to the WHIP server.
In any case, you should ensure you understand the possible
complications and explain them here to the reader. Also, the final
sentence of the paragraph is:
Note that including the ETag
in the original "201 Created" response is only REQUIRED if the WHIP
resource supports ICE restarts and OPTIONAL otherwise.
I *think* this means that the entirety of the preceding paragraph is
conditional "If the WHIP resource supports ICE restarts, the following
complications are possible ... and so the WHIP resource must return
ETag header fields ...". Whatever the causation/conditions are, they
should be put at the start of the paragraph.
A WHIP client sending a PATCH request for performing trickle ICE MUST
include an "If-Match" header field with the latest known entity-tag
as per [RFC9110] Section 3.1. When the PATCH request is received by
the WHIP resource, it MUST compare the indicated entity-tag value
with the current entity-tag of the resource as per [RFC9110]
Section 3.1 and return a "412 Precondition Failed" response if they
do not match.
Beware that this paragraph appears to be a continuation of the
preceding paragraph, in that the sentence "Note ..." appears to be a
condition on it also. If that is true, when you relocate the
condition to before the first paragraph, the condition should itself
be paragraph-level, so it is clear that its scope is not ended by the
end of the first paragraph.
Made the conditonal requirement happen on the first phase instead.
A WHIP resource receiving a PATCH request with new ICE candidates,
but which does not perform an ICE restart, MUST return a "204 No
Content" response without body.
According to the previous text, if a WHIP resource receives a PATCH
requesting an ICE restart but it does not implement ICE restart, it
MUST return either 405 or 501. This needs to be clarified; perhaps
there is a difference between "does not support" and "does not
perform"?
The "does not perform an ICE restart" is about the PATCH request, will clarify it.
If the Media Server does not support
a candidate transport or is not able to resolve the connection
address, it MUST accept the HTTP request with the "204 No Content"
response and silently discard the candidate.
Note that the Media server does not generate HTTP responses in any
case. I assume the meaning is "the WHIP resource must accept ...". I
think the intended meaning is
Right, changed.
If a WHIP resource receives a PATCH request containing a new ICE
candidate with a transport that the Media Server does not support
or is unable to resolve the address, the resource must MUST
silently discard the candidate and process the remainder of the
request. Typically, it will accept the HTTP request with the "204
No Content" response.
The last sentence is written as it is because the resource may have
some other reason for rejecting the PATCH.
Changed to this instead:
it MUST silently discard the candidate and continue processing the rest of the request normally.
A WHIP client sending a PATCH request for performing ICE restart MUST
contain an "If-Match" header field with a field-value "*" as per
[RFC9110] Section 3.1.
Naively, this seems questionable, as presumably all ICE operations
need to be serialized. Or is it that ICE restart causes all preceding
ICE updates to become irrelevant? Again, some explanation of the
allowed sequencing of ICE operations and the necessary
serialization/dependencies would be good introduction.
ICE restarts reset any previous state, so all pending UPDATES will be irrelevant.
Also, the "200 OK" response for a successful ICE restart MUST contain
the new entity-tag corresponding to the new ICE session in an ETag
response header field and MAY contain a new set of ICE candidates for
the Media Server.
This reads oddly because it combines "new ICE session" with the
possibility that the triggering PATCH request may not include new ICE
candidates. What is the implied processing if there are no
candidates? There are two possibilities, one that the new ICE session
starts with no candidates (in that direction), the other is that the
previously specified set of candidates is retained. The latter
possibility implies that the new session depends on the previous
session and introduces serialization requirements.
The new ICE session can be started without candidates from the WHIP client. They may be sent in an follow up PATCH request, or could be not needed/sent at all.
If the ICE request cannot be satisfied by the WHIP resource, the
resource MUST return an appropriate HTTP error code and MUST NOT
terminate the session immediately.
You probably mean "If the ICE update or restart request ...". Also,
you probably mean "... and MUST continue the previous ICE session."
It is only about ICE restarts, added it.
In either case, the session MUST be
terminated if the ICE consent expires as a consequence of the failed
ICE restart as per [RFC7675] Section 5.1.
Probably intensify to "In any case, ...".
Changed
Because the WHIP client needs to know the entity-tag associated with
the ICE session in order to send new ICE candidates, it MUST buffer
any gathered candidates before it receives the HTTP response to the
initial POST request or the PATCH request with the new entity-tag
value.
I think this could be clarified by putting more of the sentence in
time order:
Because the WHIP client needs to know the entity-tag associated
with the ICE session in order to send a PATCH containing new ICE
candidates, it MUST wait until it receives the HTTP response to the
initial POST request or the previous PATCH request before it can
send any candidates that were gathered after the previous request
was sent.
--
changed to:
Because the WHIP client needs to know the entity-tag associated with the ICE session in order to send a PATCH request containing new ICE candidates, it MUST wait and buffer any gathered candidates until it receives the HTTP response with the new entity-tag value to either initial POST request or the last PATCH request performing an ICE restart.
... the STUN requests will contain invalid ICE information and will
be rejected by the server. When this situation is detected by the
WHIP Client, it MUST ...
Given that this is a MUST, we need a more rigid specification of the
specific responses that compel this behavior in the client. Also,
"the server" is used, but apparently in the sense "the STUN server",
not "the Media server" as elsewhere in the text. Also, it seems to be
implied that the Media server will not do ICE restart if it receives
these errors from the STUN server. Is that true?
Changed to:
If there is a mismatch between the ICE information at the WHIP client and at the WHIP session (because of an out-of-order request), the STUN requests will contain invalid ICE information and will be dropped by the receiving side. If this situation is detected by the WHIP Client, it MUST send a new ICE restart request to the server.
4.2. WebRTC constraints
In the specific case of media ingestion into a streaming service,
Given that the abstract says "ingestion of content into streaming
services and/or CDNs", this appears to confine the paragraph to
discussing a subset of WHIP implementations. This clause should
probably be revised to clarify that it covers all WHIP usages.
Removed the whole phrase as the draft-ietf-rtcweb-gateways-02 is only relevant for adding ICE lite support, which we already done in previous sections.
Both the WHIP client and the WHIP endpoint SHALL use SDP bundle
[RFC9143]. ...
Please verify that all proper specifications are both "MUST support"
and "MUST use". E.g., there is "... will use RTP/RTCP multiplexing
..." which should say "SHALL"/"MUST".
Yes, it is MUST support and use, added clarification
... bundled "m=" sections as per [RFC8858] i.
The ending "i" seems like a typo, but please verify it isn't the
truncation of intended text.
Removed
... in a single MediaStream as defined in [[!RFC8830]] ...
It appears that RFC 8830 is an intended reference but it is not listed
in section 8.
fixed
When a WHIP client sends an SDP offer, it SHOULD insert an SDP
"setup" attribute with an "actpass" attribute value, as defined in
[RFC8842]. However, if the WHIP client only implements the DTLS
client role, it MAY use an SDP "setup" attribute with an "active"
attribute value. If the WHIP endpoint does not support an SDP offer
with an SDP "setup" attribute with an "active" attribute value, it
SHOULD reject the request with a "422 Unprocessable Entity" response.
NOTE: [RFC8842] defines that the offerer must insert an SDP "setup"
attribute with an "actpass" attribute value. However, the WHIP
client will always communicate with a Media Server that is expected
to support the DTLS server role, in which case the client might
choose to only implement support for the DTLS client role.
Note this means that this document updates RFC 8842, and should be
marked as such.
I will ask for chairs advice on this.
This part seems to be incomplete or inconsistent. Clearly, a WHIP
server MUST support the DTLS server role, and so it MUST be able to
accept "a=setup:active". This means that the described 422 error
response should never occur.
The media server must support the DTLS server role, but the WHIP endpoint may not be able to parse the SDP containing an "setup:active" as the standard is "setup:actpass".
4.3. Load balancing and redirections
WHIP endpoints and Media Servers might not be colocated on the same
server, so it is possible to load balance incoming requests to
different Media Servers.
In stead of the passive "it is possible ...", it would be clearer to
say "the WHIP server may load balance incoming requests to multiple
Media servers."
Being pedantic, the redirection could (and most probably will happen), in a network layer before the WHIP endpoints (nginx reverse proxies for example). We don't have the term "service provider" used before, so I think that the passive voice is appropriate.
WHIP clients SHALL support HTTP redirection
via the "307 Temporary Redirect" response as described in [RFC9110]
Section 6.4.7.
RFC 9110 does not contain a section 6.4.7.
Updated section reference to 15.4.8.
4.4. STUN/TURN server configuration
A reference to each STUN/TURN server will be returned using the
"Link" header field [RFC8288] with a "rel" attribute value of "ice-
server".
I notice that there's no explicit differentiation between the STUN and
the TURN server information. Please verify that either (1) the
presence of username and credential-type differentiates the two cases,
or (2) the Media client/server do not need to differentiate the two
cases.
As explained above, the differentiation is done by the url scheme "turn:" vs "stun:"
Figure 5: Example ICE server configuration
It seems the caption should be "Example STUN/TURN server
configuration".
Changed to align to the section title.
The generation of the TURN server credentials may require performing
a request to an external provider, which can both add latency to the
OPTIONS request processing and increase the processing required to
handle that request. In order to prevent this, the WHIP Endpoint
SHOULD NOT return the STUN/TURN server configuration if the OPTIONS
request is a preflight request for CORS, that is, if The OPTIONS
request does not contain an Access-Control-Request-Method with "POST"
value and the the Access-Control-Request-Headers HTTP header does not
contain the "Link" value.
I think the normative structure can be made clearer as:
In order to avoid adding latency and processing to an OPTIONS
request, the WHIP server SHOULD NOT return return the STUN/TURN
server configuration in OPTIONS responses unless either (1) the
configuration can be determined without extra processing or latency
(e.g. by performing a request to an external provider), or (2) if
the OPTIONS request is an XXX, and contains an
Access-Control-Request-Method with "POST" value and the the
Access-Control-Request-Headers HTTP header contains the "Link"
value.
"is an XXX" optional and is to clarify the situation that exception
(2) supports. From the current text, I think it should be "is not a
preflight request for CORS". If "preflight request for CORS" is
retained, please add a reference to "[FETCH]" here, as the only other
mention of CORS in the document which has the reference to "[FETCH]"
is a long way away.
The OPTIONS request for stun/turn server configuration is heavily discouraged, so I would prefer not to change the mandatory language. I have added the fetch reference.
It might be also possible to configure the STUN/TURN server URIs with
long term credentials provided by either the broadcasting service or
an external TURN provider on the WHIP client, overriding the values
provided by the WHIP endpoint.
This should start with that it is the WHIP client that is being
configured:
The WHIP client MAY be configured with the STUN/TURN server URIs
and long term credentials provided by either the broadcasting
service or an external TURN provider, overriding the values
provided by the WHIP endpoint.
I have rephrased as follows:
The WHIP clients MAY also support configuring the STUN/TURN server URIs with long term credentials provided by either the broadcasting service or an external TURN provider, overriding the values provided by the WHIP endpoint.
4.7. Protocol extensions
The WHIP endpoint MUST
return one "Link" header field for each extension, with the extension
"rel" type attribute and the URI for the HTTP resource that will be
available for receiving requests related to that extension.
This seems awkward to me. Perhaps
The WHIP endpoint MUST return one "Link" header field for each
extension that it supports, with the "rel" attribute having the
extension URN as its value and the URI being suitable for that
extension.
The part "the URI being suitable for that extension" recognizes that
we can't predict what the semantics of the URI will be for a future
extension, but the extension will define the semantics.
The protocol extensions must be HTTP based, so I don't think that the URI part is needed. Reworded the rest as:
The WHIP endpoint MUST return one "Link" header field for each extension that it supports, with the extension "rel" attribute value containing the extension URN and the URI for the HTTP resource that will be available for receiving requests related to that extension.
Protocol extensions supported by the WHIP server MUST be advertised
to the WHIP client in the "201 Created" response to the initial HTTP
POST request sent to the WHIP endpoint.
It seems like a good idea to require protocol extension support
declaration in OPTIONS responses as well.
As stated above, the extensions URIs are only known when the session is created, so they are not available to OPTIONS requests.
5. Security Considerations
The writers should do a spelling-check pass over section 5, as there
are enough misspellings to inconvenience the Editor.
On top of that, the WHIP protocol exposes a thin new attack surface
specific [sic] of the REST API methods used within it:
Yes, it was already risen and reviewed.
It seems like the following three items could be condensed, as most of
the text enumerates problems that are well-known or clear from the
context. Perhaps something like:
* HTTP flooding and resource exhaustion: Both the WHIP server and
the WHIP resources SHOULD implement a rate limit and avalanche
control mechanism for incoming requests.
* Security of WHIP resource URLs: Servers for WHIP resource URLs
SHOULD either implement authentication and authorization
similarly to WHIP servers (see section 4.5) or use URLs that are
capabilities, in that they are unguessable and the possession of
the URL shows the client has the right to access the resource.
IIRC we received feedback on the opposite direction about being more verbose about the security considerations, would prefer to leave it as is.
6.1. Link Relation Type: ice-server
The link relation type below has been registered by IANA per
Section 4.2 of [RFC8288].
I do not see "ice-server" in
https://www.iana.org/assignments/link-relations/link-relations.xhtml,
so it appears the wording should be "IANA is asked to register ...".
6.3.1. Specification Template
* The designated contact shall be responsible for reviewing and
enforcing uniqueness.
Should this be "designated expert"? Compare with sections 6.4.1 and
6.4.2.
Both texts above has been revised by IANA already, would prefer not to change without their input.
6.4.1. Registration Procedure
The RFC SHOULD include any attributes defined.
If this means what I think it does, it should be expanded to something
like
The RFC MUST include the syntax and semantics of any
extension-specific attributes that may be provided in a Link header
field advertising the extension.
Changed
[END]
This was intense... 😅
Thank you once again for your thorough review.
Sergio
-- last-call mailing list last-call@xxxxxxxx https://www.ietf.org/mailman/listinfo/last-call