Re: Tsvart last call review of draft-ietf-ice-rfc5245bis-16

Christer Holmberg <christer.holmberg@xxxxxxxxxxxx> · Fri, 2 Feb 2018 11:12:11 +0000

Hi,

Most issues seem to ve solved now, but I still have some comments on the
ICMP suggestion, and the math stuff at the end.

>>>>>Significant Issues:
>>>>>
>>>>> A. Section 5.2:
>>>>>
>>>>>      Lite implementations only utilize host candidates.  A lite
>>>>>      implementation MUST, for each component of each data stream,
>>>>> allocate
>>>>>      zero or one IPv4 candidates.  It MAY allocate zero or more IPv6
>>>>>      candidates, but no more than one per each IPv6 address utilized
>>>>>by
>>>>>      the host.  Since there can be no more than one IPv4 candidate
>>>>>per
>>>>>      component of each data stream, if an ICE agent has multiple IPv4
>>>>>      addresses, it MUST choose one for allocating the candidate.  If
>>>>>a
>>>>>      host is dual-stack, it is RECOMMENDED that it allocate one IPv4
>>>>>      candidate and one global IPv6 address.  With the lite
>>>>> implementation,
>>>>>      ICE cannot be used to dynamically choose amongst candidates.
>>>>>      Therefore, including more than one candidate from a particular
>>>>> scope
>>>>>      is NOT RECOMMENDED, since only a connectivity check can truly
>>>>>      determine whether to use one address or the other.
>>>>>
>>>>> I find it quite strange that the above text says there can only be
>>>>> single
>>>>> IPv4
>>>>> based candidate, while for IPv6 a LITE implementation may have one
>>>>> candidate
>>>>> per IPv6 address. Isn't the LITE implication of having multiple
>>>>> candidates for
>>>>> the same address family similar? Yes, IPv6 kind of forces the need
>>>>>for
>>>>> dealing
>>>>> with multiple IPv6 addresses on any host. However, I can see that
>>>>> certain
>>>>> servers will actually be multi-homed in IPv4 and thus can in a
>>>>>sensible
>>>>> way
>>>>> actually have multiple IPv4 candidates, and let the clients select
>>>>> which
>>>>> interface has the best reachability.
>>>>>
>>>>> Can you please be explicit on what in ICE prevents things to work for
>>>>> IPv4 but
>>>>> the same case works for IPv6?
>>>> This is text from RFC 5245. I agree it is confusing, and
>>>>unfortunately I
>>>> don¹t have a good answer.
>>>>
>>>> I guess my approach would be to suggest that we simply remove the
>>>> restriction. In addition, there is generic text about dual-stack etc
>>>> elsewhere,
>>>> and I don¹t see anything ICE lite specific.
>>>>
>>>> OLD:
>>>>
>>>> "Lite implementations only utilize host candidates.  A lite
>>>>      implementation MUST, for each component of each data stream,
>>>> allocate
>>>>      zero or one IPv4 candidates.  It MAY allocate zero or more IPv6
>>>> candidates, but no more than one per each IPv6 address utilized by
>>>>      the host.  Since there can be no more than one IPv4 candidate per
>>>>      component of each data stream, if an ICE agent has multiple IPv4
>>>>      addresses, it MUST choose one for allocating the candidate.  If a
>>>>      host is dual-stack, it is RECOMMENDED that it allocate one IPv4
>>>>      candidate and one global IPv6 address.  With the lite
>>>> implementation,
>>>>      ICE cannot be used to dynamically choose amongst candidates.
>>>>      Therefore, including more than one candidate from a particular
>>>>scope
>>>>      is NOT RECOMMENDED, since only a connectivity check can truly
>>>>      determine whether to use one address or the other."
>>>>
>>>>
>>>> NEW:
>>>>
>>>> "Lite implementations only utilize host candidates.
>>>> With the lite implementation, ICE cannot be used to dynamically choose
>>>> amongst candidates. Therefore, including more than one candidate from
>>>>a
>>>> particular IP address family is NOT RECOMMENDED, since only a
>>>> connectivity
>>>> check can Truly determine whether to use one address or the other."
>>> I have read Ben's comment on this issue. I think something needs to be
>>> done to avoid the confusion. There are I think two alternatives,
>>> either retain the restriction with an added note on why this
>>>restriction
>>> exists. The second I think is to go with an amended version of the text
>>> proposal, possibly with even clearer wording that if you are running
>>> multiple addresses in the same family, you should really should be
>>>using
>>> a full implementation.
>>>
>>> I noted that the "NEW" text above needs an amendment, and that is
>>> because I think a very important part of what was cut needs to be
>>> retained.
>>>
>>> It MAY allocate zero or more IPv6
>>> candidates, but no more than one per each IPv6 address utilized by
>>>     the host.
>>>
>>> If one generalize this what is missing is the limitation that only a
>>> single candidate per IP address utilized by the host can be added by a
>>> lite implementation. Using multiple would be redundant, but lets use a
>>> direct statement to this affect, rather than a subordinate clause to  a
>>> MAY statement.
>> Could you suggest new text?
>
>What about:
>
>"Lite implementations only utilize host candidates. For each IP address,
>independent of IP address family, there MUST be zero or one candidate.
>With the lite implementation, ICE cannot be used to dynamically choose
>amongst candidates. Therefore, including more than one candidate from a
>particular IP address family is NOT RECOMMENDED, since only a connectivity
>check can Truly determine whether to use one address or the other.
>Instead agents that have multiple public IP addresses are RECOMMENDED to
>run full ICE implementations to ensure the best usage of its addresses."

Looks good.

---

>>>>>B. Section 6.1.1:
>>>>>
>>>>>      An agent MUST be prepared that the peer might re-determine the
>>>>> roles
>>>>>      as part of any ICE restart, even if the criteria for doing so
>>>>>are
>>>>> not
>>>>>      fulfilled.  This can happen if the peer is compliant with an
>>>>>older
>>>>>      version of this specification.
>>>>>
>>>>> What does it mean to be prepared for a peer that re-determine the
>>>>> roles?
>>>>> What
>>>>> is it one MUST do? If the peer changes its role upon an ICE restart,
>>>>> isn't that
>>>>> going to result in a role mismatch? Thus causing yet another ICE
>>>>> restart,
>>>>> where
>>>>> also this peer will re-evalute? Isn't that good enough? Or is it
>>>>> something else
>>>>> it can do?
>>>> The roles are re-negotiated during the ICE restart: it may, or may
>>>>not,
>>>> result in a role mismatch.
>>>>
>>>> "Prepared" means that the peer might change its role even though it
>>>>does
>>>> not fulfil the 5245bis criteria for being allowed to do so.
>>> Yes, but prepared is not actionable. I guess what you mean is that an
>>> implementation MUST accept and perform re-determination of the role
>>> initiated by the peer agent, even if the criteria are not fulfilled.
>> What about:
>>
>> OLD:
>>
>> An agent MUST be prepared that the peer might re-determine the roles
>>       as part of any ICE restart, even if the criteria for doing so are
>>not
>>       fulfilled.  This can happen if the peer is compliant with an older
>>       version of this specification.
>>
>>
>> NEW:
>>
>> An agent MUST accept if the peer initiates a re-determination of the
>>roles
>> even if the criteria for doing so are not fulfilled. This can happen if
>>the
>> peer is compliant with RFC 5245.
>
>Yes, I think that is much clearer and indicates what might occur, and
>what to do in the case and why.

I will modify the text as above.

---

>>>>>D. Section 6.1.4.2:
>>>>>
>>>>> I don't know if I misunderstand the algorithm here in the bullet
>>>>>list.
>>>>> To
>>>>> me it
>>>>> appears that it will terminate prior to have initiated all possible
>>>>> tests, as
>>>>> it appears that it will not unfreeze some of the candidate pairs. If
>>>>> one
>>>>> have
>>>>> tests running for a foundation, but all other candidate checks have
>>>>> been
>>>>> started, then the steps are aborted. Is the bullet list rechecked
>>>>>every
>>>>> Ta?
>>>> No. Whenever Ta fires, one check list in Running state is checked.
>>>>When
>>>> all check lists have been checked, it will start over from the top of
>>>> the
>>>> list.
>>>>
>>>> Or, did I misunderstand your issue?
>>> No, It was unclear that this is re-run every time Ta fires. I think the
>>> reason for this is this sentence in step 4.
>>>
>>> If this happens for every single check list in the
>>>         Running state, meaning there are no remaining candidate pairs
>>>to
>>>         perform connectivity checks for, abort these steps.
>>>
>>> It was unclear to me that this doesn't mean aborting the whole
>>> procedure. Because if the requirement is fulfilled then there is
>>>nothing
>>> to do until the next Ta, which will possible unfreeze another candidate
>>> pair in step 2. I would recommend that you reformulate this sentence to
>>> make it clear that the processing for this Ta is stopped, and
>>>processing
>>> will be resumed the next time Ta fires.
>> Isn¹t that indicated in the sentence before the list:
>>
>>   "Whenever Ta fires, the ICE agent will perform a check for a candidate
>>    pair within the picked check list by performing the following steps:²
>
>Yes, assuming you correctly interpret those. Which, I clearly didn't
>that is why I am requesting additional clarity. But, if you think it is
>good enough, I will let the issue drop.

I will keep the text as it is for now, but keep it in mind, in case I come
up with something.

---

>>>>> E. Section 7.2.5.2.2.  ICMP Error
>>>>>
>>>>>      An ICE agent MAY support processing of ICMP errors for
>>>>>connectivity
>>>>>      checks.  If the agent supports processing of ICMP errors, and
>>>>>if a
>>>>>      Binging request generates an ICMP error, the agent SHOULD set
>>>>>the
>>>>>      state of the candidate pair to Failed.
>>>>>
>>>>> I am a bit worried by this blanket statement on ICMP errors. I think
>>>>>it
>>>>> should
>>>>> be clarified which ICMP message types that are relevant to consider
>>>>>as
>>>>> errors?
>>>>> I assume Type 3 (Destination Unreachable) but maybe not all responde
>>>>> codes as
>>>>> Codes 4, 11,12 may be addressable in other ways, and likely Type 11
>>>>> (Time
>>>>> exceeded) with response code 0, response code 1 is not a clear
>>>>> indication
>>>>> of a
>>>>> non working path.
>>>> This is from RFC 5245.
>>>>
>>>> I don¹t think the ICE WG should go through all different codes and
>>>> combinations, and determine what should be considered an error, and
>>>>what
>>>> not.
>>>>
>>>> If you can provide something (table, guidance etc), we are happy to
>>>> include it. Otherwise I¹d like to keep it as it is, and let
>>>> implementations deal with it, as at least I am not aware that this
>>>>would
>>>> Have caused issues in ICE deployments.
>>> I think we there is a point to clarify that this applies to ICMP
>>> messages indicating a non-usable path. So maybe it could be rewritten
>>>to
>>> something like this:
>>>
>>>     An ICE agent MAY support processing of ICMP messaging indicating a
>>> non-functioning path for connectivity
>>>     checks. ICMP messages of type 3 (Destination Unreachable) are
>>> indicators of a currently non-functioning path. However, the issue can
>>>be
>>> temporary as it can depend on routing transients, this for example
>>> applies to codes 0, 1 and 5. Other messages that appear to indicate
>>> non-functioning path such as Type=11 (Time Exceeded) with code=0 (Time
>>>to
>>> Live exceeded in Transit) are not clear indicator as the IP packets
>>> potentially can reach the destination with a larger TTL value set at
>>> transmission. Therefore, implementation needs to analyse the different
>>> ICMP messages types and codes for which it considers the path as
>>> non-functioning. If the agent supports processing of ICMP errors, and
>>>if a
>>>     Binging request generates an ICMP error, the agent SHOULD set the
>>>     state of the candidate pair to Failed.
>>>
>>>
>>> What also is not addressed in this is the risk of denial of service
>>> attacks using spoofed ICMP messages to shutdown certain connectivity
>>> checks. The security considerations lack any discussion of this issue.
>>> If ICMP processing are retained, I think a recommendation about
>>> validation is needed to avoid at least off path attackers from doing
>>> these attacks easily. Unfortunately ICMP response will only include the
>>> IP/UDP header, thus no data from the STUN messages which would allow
>>> verification that the ICMP messages matches an actually sent check.
>>>
>>> It may be simplest to recommend against reacting to ICMP errors from
>>> both the perspective that it is a risk for denial of service attack, as
>>> well as that it represents a risk terminating connectivity checks for a
>>> transient issue. From my perspective I expect this to reduce the number
>>> of sent connectivity checks very little
>> So, are you saying that an agent should simply ignore ICMP messages?
>
>Yes, I think that may be the best. There are a bit to many corner cases
>and significant attack surface that getting all the details right are
>significant work for a relatively small reduction in sent connection
>check messages.

I am not an expert, but aren’t you going to get ICMP unreachable every
time you try to contact a host candidate behind a NAT? Are you saying that
the agent should ignore it, as the connectivity test will anyway timeout
at some point?

Also, note that RFC 5398 (STUN) says:

  "A STUN transaction over UDP is also considered failed if there has been
a
   hard ICMP error [RFC1122].”

I am a little worried that people would have to use ICE-specific STUN
stacks if they are required to ignore ICMP errors.

Couldn’t we simply use the existing text, and add a sentence about DoS
attacks?

OLD:

 An ICE agent MAY support processing of ICMP errors for connectivity
 checks.  If the agent supports processing of ICMP errors, and if a
 Binging request generates an ICMP error, the agent SHOULD set the
 state of the candidate pair to Failed.

NEW:

 An ICE agent MAY support processing of ICMP errors for connectivity
 checks. If the agent supports processing of ICMP errors, and if a
 Binging request generates an ICMP error, the agent SHOULD set the
 state of the candidate pair to Failed. Implementers need to be aware
that ICMP errors can be used as a method for denial of service attacks
when making a decision on how and if to process ICMP errors.

---

>>>>>H. Section 14.3:
>>>>>
>>>>>       Num-Waiting: the number of checks in the check list in the
>>>>>       Waiting state.
>>>>>
>>>>>       Num-In-Progress: the number of checks in the In-Progress state.
>>>>>
>>>>> Is "the number of checks" only per single checklist or across all the
>>>>> check
>>>>> lists?
>>>> Per single check list.
>>> Section 14.1 says: "Those formulas scale
>>>     with N, the number of checks to be performed."
>>>
>>> But, from my perspective, that number N is not on a single checklist,
>>> but for the whole ICE session, i.e. all the checklists. Thus, I think
>>> there needs a clarification on why this is per checklist, making it
>>> clear why these two things match up.
>> Actually, I think I was wrong. It is per check list SET, i.e., all check
>> lists.
>
>So, please correct the language to be clear on that it is across the
>whole check list set.

I will s/check list/check list set.

---

>>>>>I. Section 17.2.3:
>>>>>
>>>>> When VAD is being
>>>>>     used, keepalives will be sent during silence periods.
>>>>>
>>>>> I would claim that this is only true for when VAD without any comfort
>>>>> noise is
>>>>> used. A lot of codecs with VAD operations still generates comfort
>>>>>noise
>>>>> on a
>>>>> frequency of a couple packets a second, way more often then the
>>>>>minimal
>>>>> for ICE
>>>>> keep-alives.
>>>> What about:
>>>>
>>>> "In deployments that are not utilizing Voice Activity Detection (VAD),
>>>> without any comfort noise,
>>>> the keepalives are never..."
>>> That is also not true. As keep-alives is not expected to be needed for:
>>> 1. Continous media withouth any VAD etc.
>>> 2. Media with VAD, but with comfort noise not allowing intervals to be
>>> to long (<=1 second).
>> What about:
>>
>> OLD:
>>
>> In deployments that are not utilizing Voice Activity Detection (VAD),
>>the
>> keepalives are never
>>     used and there is no increase in bandwidth usage.  When VAD is being
>>     used, keepalives will be sent during silence periods.  This involves
>>     a single packet every 15-20 seconds, far less than the packet every
>>     20-30 ms that is sent when there is voice.  Therefore, keepalives
>>     don't have any real impact on capacity planning.
>>
>>
>>
>> NEW:
>>
>> In deployments with continuous media and without utilizing Voice
>>Activity
>> Detection (VAD),
>> or deployments where VAD is utilized together with short interval (max 1
>> second) comfort noise,
>> the keepalives are never used and there is no increase in bandwidth
>>usage.
>>   When VAD is being
>>     Used without comfort noise, keepalives will be sent during silence
>> periods.  This involves
>>     a single packet every 15-20 seconds, far less than the packet every
>> 20-30 ms that is sent when
>> there is voice.  Therefore, keepalives don't have any real impact on
>> capacity planning.
>
>Looks good.

I will modify the text as above.

=======

Minor/Editorial Issues:

 ...

>>>>> 7. Section 7.3:
>>>>>
>>>>> If the agent is using Diffserv Codepoint markings [RFC2475] in its
>>>>>      data packets, it SHOULD apply the same markings to Binding
>>>>> responses.
>>>>>
>>>>> I find this sentence a bit unclear. Is it intended to say:
>>>>>
>>>>> If the agent receiving the binding request, intended to use DSCP
>>>>> markings
>>>>> !=0
>>>>> for the data, it SHOULD set, the same marking to binding responses.
>>>>>
>>>>> or
>>>>>
>>>>> If the agent receives a binding request with DSCP markings, then it
>>>>> should
>>>>> apply to corresponding code point when forming the binding response?
>>>> It means that it will use the same markings in Binding responses that
>>>>it
>>>> uses in data packets (audio, video, etc).
>>> Okay, I wonder if it is the temporal aspect of the sentence that makes
>>> it harder for me to parse correctly.
>>>>> There are unclarity of which agent is referenced and whom "it" is in
>>>>>the
>>>>> sentence.
>>> It is the STUN server.
>>>
>>> Would the following be more clear?
>>>
>>> "If the agent is using Diffserv Codepoint markings [RFC2475] in data
>>> packets that it sends, the agent SHOULD apply the same markings to
>>>Binding
>>> responses."
>>>
>>> Yes, except that I stumbles "it sends" where it is more likely "it will
>>> send" is the correct form.
>> "If the agent is using Diffserv Codepoint markings [RFC2475] in data
>> packets that it will send, the agent SHOULD apply the same markings to
>> Binding
>> Responses that it will send."
>
>Yes. works for me.

I will modify the text as above.

---

>>>>> 8. Section 8.3.1:
>>>>>
>>>>>     The procedures in Section 8 require that an ICE agent continue to
>>>>>     listen for STUN requests and continue to generate triggered
>>>>>checks
>>>>>     for a data stream, even once processing for that stream
>>>>>completes.
>>>>>
>>>>> That reference to Section 8, should that in fact be to Section 8.1
>>>>> specifically? It looks strange with a self reference, which in some
>>>>> aspect a
>>>>> reference to section 8 means.
>>> I think this issue was missed.
>> Sorry for that.
>>
>> The text can be removed, because it refers to text further down in the
>> same subsection (first sentence of second paragraph).
>
>Yes, removing that sentence works for me.

Will be removed.

---

>>>>>9. Section 15:
>>>>>
>>>>> 4.57566E+18 (note that
>>>>>     an implementation would represent this as a 64-bit integer so as
>>>>>not
>>>>>     to lose precision).
>>>>>
>>>>> Why the floating point representation? Priorities are integer numbers
>>>>> and
>>>>> thus
>>>>> should be presented as such in this example.
>>>> This is from RFC 5245, and unfortunately I don¹t know.
>>> Can you not just calculate the 64-bit integer and write it out?
>> So, you want me to write 4575660000000000000?
>
>No, I thought the pair priority will not be 0 for the lower 32 bits and
>that there actually are overflow in this deduction. My understanding is
>that what should be written here is the calculation of:
>
>pair priority = 2^32*MIN(G,D) + 2*MAX(G,D) + (G>D?1:0)
>
>Where G and D are the priority values for $L_PRIV_1 and $R_PUB_1.
>
>$L_PRIV_1 is L's host candidate, and stated to have a priorty of
>2130706431
>
>$R_PUB_1 is R's host candidate and stated to have a priorty of 2130706431
>
>So G = 2130706431 and D=2130706431
>
>pair priority then becomes 2^32*2130706431 + 2*2130706431 + 0 =
>9151314442783293438
>
>Thus, I think the next two pair priorities in the example also needs to
>be fixed.
>
>The first pair priority for R is the same value as for L, as G and D are
>identical. But the next value will
>be different. To my understanding L will be controlling, i.e. L's
>candidate will be G
>
>G=1694498815
>D= 2130706431
>
>Pair priority = 2^32*1694498815+2*2130706431 + 0 = 7277816997797167102

Please provide text for the section (or, at least the modified paragraphs)
:)

Regards,

Christer
<<attachment: smime.p7s>>