Protocol Action: 'Negotiating Human Language in Real-Time Communications' to Proposed Standard (draft-ietf-slim-negotiating-human-language-24.txt)

The IESG <iesg-secretary@ietf.org> · Tue, 20 Feb 2018 10:10:23 -0800

The IESG has approved the following document:
- 'Negotiating Human Language in Real-Time Communications'
  (draft-ietf-slim-negotiating-human-language-24.txt) as Proposed Standard

This document is the product of the Selection of Language for Internet Media
Working Group.

The IESG contact persons are Adam Roach, Alexey Melnikov and Ben Campbell.

A URL of this Internet Draft is:
https://datatracker.ietf.org/doc/draft-ietf-slim-negotiating-human-language/

Technical Summary:

   In establishing a multi-media communications session, it can be
   important to ensure that the caller's language and media needs
   match the capabilities of the called party.  This is important
   in non-emergency uses (such as when calling a company call
   center) or in emergencies where a call can be handled by a
   call taker capable of communicating with the user, or a
   translator or relay operator can be bridged into the call
   during setup.

   This document describes the problem of negotiating human
   (natural) language needs, abilities and preferences
   in spoken, written and signed languages.  It also provides
   a solution using new stream attributes within the Session
   Description Protocol (SDP).    
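As a rough illustration of the per-stream attributes, the following Python sketch reads hlang-send and hlang-recv values from an SDP body. The sample offer (addresses, ports, payload types, language choices) and the helper name parse_hlang are hypothetical. The sketch assumes each attribute value is a space-separated list of language tags in preference order. It is not a complete SDP parser.

```python
# Hypothetical SDP offer; addresses, ports and payload types are made up.
SDP_OFFER = """\
v=0
o=- 0 0 IN IP4 192.0.2.1
s=-
c=IN IP4 192.0.2.1
t=0 0
m=audio 49170 RTP/AVP 0
a=hlang-send:es en
a=hlang-recv:es en
m=video 51372 RTP/AVP 31
a=hlang-send:ase
a=hlang-recv:ase
"""


def parse_hlang(sdp):
    """Return a list of (media, {"send": [...], "recv": [...]}), one per m-line.

    Assumption: attribute values are space-separated language tags,
    most preferred first.
    """
    streams = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # The media type is the first token after "m=".
            media = line[2:].split()[0]
            streams.append((media, {"send": [], "recv": []}))
        elif line.startswith("a=hlang-") and streams:
            # Attributes apply to the most recent m-line (stream level).
            name, _, value = line[2:].partition(":")
            direction = name[len("hlang-"):]
            if direction in ("send", "recv"):
                streams[-1][1][direction] = value.split()
    return streams


for media, langs in parse_hlang(SDP_OFFER):
    print(media, langs)
```

Here the audio stream offers spoken Spanish or English, and the video stream offers American Sign Language (tag "ase"). Each direction is stated separately for each stream.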

Working Group Summary:

This draft has undergone 13 revisions since its initial IETF last call (which occurred on draft -06).  These
revisions were required to address issues raised by the IETF community, such as: 

1. The meaning of the "*" in language negotiation. The SDP directorate review in the initial IETF last call expressed concern
over the handling of the asterisk, which had the properties of a session attribute while being included within individual m-lines.
WG consensus was to remove the asterisk, whose role had been advisory.

2. Routing of calls.  The SDP directorate review in the initial IETF last call expressed concern about whether the document
intended the use of SDP for routing of SIP traffic.  Language was added to indicate clearly that call routing was
out of scope.

3. Combining of hlang-send/hlang-recv. In IETF last call, a reviewer suggested that the document allow combining the
hlang-send and hlang-recv indications to allow a more efficient representation in cases where language preference is
symmetrical. The WG did not accept this suggestion, since it was not clear that the efficiency gain was worth the
additional complexity.

In addition to issues brought up in IETF last call, there was substantial WG discussion on the following points: 

4. Undefined language/modality combinations. Language tags do not always distinguish spoken from written
language, so some combinations of languages and media are not well defined. The text in Section 5.4 
resulted from WG discussion of several scenarios:

    a. Captioning. While the document supports negotiation of sign language in a video stream, it does not 
    define how to indicate that captioning (e.g. placement of text within the video stream) is desired.
    WG consensus did not support use of suppressed script tags for this purpose.

    b. SignWriting (communicating sign language in written form). Currently only a single language tag has been defined
    for SignWriting, so written communication of sign language in a text stream (or in captioning) is also not
    defined.

    c. Lipreading (spoken language within video). There was no WG consensus for explicitly indicating
    the desire for spoken language in a video stream (e.g. by use of the -Zxxx script subtag), since the
    ability to negotiate "lip sync" is already provided in RFC 5888.

As a result of these discussions, Section 5.4 leaves a number of potential combinations of language and
media undefined.  If implementation experience shows a need to define these scenarios, they
can be addressed in future work.

5. Preferences between media.  As an example, an individual might be able to understand written English communicated using
Realtime Text, but might prefer spoken English audio.  The current draft enables all modes of communication to be
negotiated, but does not indicate a preference between them.  WG consensus was that it was acceptable, and
possibly more reliable, for all mutually supported media to be negotiated and brought up, and then to let the conversants
decide which media to use, rather than taking on the additional complexity of negotiating media preference beforehand.

During discussion, it was pointed out that quality issues could influence media preferences during a call.  
For example, on a call where audio, video and text are all available, sending video may interfere with
audio quality so that video sending needs to be disabled.  Alternatively, audio quality could be poor so that
the conversants need to resort to text.  So media quality issues can negate the "best laid plans" of 
media preference negotiation.
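The behaviour described in point 5 can be sketched as follows. This is a simplified illustration and not the matching procedure of the draft. The function name mutual_languages and the sample data are hypothetical, and the sketch ignores the send/receive distinction. Every medium with a common language is kept, and no ranking between media is produced.

```python
def mutual_languages(offer, capabilities):
    """For each medium, pick the first offered language the answerer supports.

    offer: dict of medium -> list of language tags, most preferred first
    capabilities: dict of medium -> set of language tags the answerer can use
    Returns a dict of medium -> chosen tag. Media with no common language
    are omitted. The result carries no preference between media.
    """
    chosen = {}
    for medium, tags in offer.items():
        # Language tags are compared case-insensitively.
        supported = {t.lower() for t in capabilities.get(medium, set())}
        for tag in tags:
            if tag.lower() in supported:
                chosen[medium] = tag
                break
    return chosen


# Hypothetical caller who can use spoken or written English, or ASL on video.
offer = {"audio": ["en"], "text": ["en"], "video": ["ase"]}
# Hypothetical called party with no sign language capability.
capabilities = {"audio": {"en", "es"}, "text": {"en"}}

print(mutual_languages(offer, capabilities))
```

With this input both audio and text are brought up in English. Which of the two the conversants actually use is left to them during the call.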

Document Quality:

  There are no current implementations of draft-ietf-slim-negotiating-human-language.  However, the National Emergency Number Association
  (NENA) has referenced it in NENA 08-01 (i3 Stage 3 version 2) in describing attributes of emergency calls presented to an ESInet, and
  some 3GPP change requests (CRs) introduced in SA1 have referenced the functionality.  Implementation is therefore expected.

Personnel:

  Bernard Aboba is the Document Shepherd.  The responsible area director is Alexey Melnikov.