Re: [Last-Call] Artart last call review of draft-ietf-calext-jscontact-07

Hi,

I'm still not convinced that there is anything Bidi-specific to add to this specification beyond what the Unicode Standard and the current specification already define.

I tried to come up with a paragraph that includes the recommendations outlined in your email. Every attempt ended up emphasizing selected requirements of the Unicode specifications, without making clear why those requirements need mentioning and others do not.

On the point of requiring Unicode Bidi_Control closing tags: Unicode Standard Annex #9 already clearly defines what to do with string values that do not embed balanced PDF or PDI characters. Certainly we can reiterate the requirements of the Unicode Bidirectional Algorithm, but that feels kind of out of place to me. Would every RFC dealing with Unicode text do that?
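
For illustration only, here is a minimal Python sketch of what a "balanced" check could look like on the producer side. It is not taken from the draft or from UAX #9: the function name and the strict-nesting rule are assumptions of this sketch, and the rule is deliberately stricter than UAX #9, which ignores an unmatched PDF or PDI and closes unterminated embeddings and isolates at the end of the paragraph.

```python
# Bidi formatting code points as defined in UAX #9.
EMBEDDING_OPENERS = {"\u202A", "\u202B", "\u202D", "\u202E"}  # LRE, RLE, LRO, RLO
ISOLATE_OPENERS = {"\u2066", "\u2067", "\u2068"}              # LRI, RLI, FSI
PDF = "\u202C"  # closes an embedding or override
PDI = "\u2069"  # closes an isolate


def has_balanced_bidi_controls(text: str) -> bool:
    """Return True if every opener has its matching closer, properly nested.

    Hypothetical helper: strict nesting is an assumption of this sketch,
    not a rule from the draft.
    """
    expected_closers = []
    for ch in text:
        if ch in EMBEDDING_OPENERS:
            expected_closers.append(PDF)
        elif ch in ISOLATE_OPENERS:
            expected_closers.append(PDI)
        elif ch in (PDF, PDI):
            # A closer with no opener, or of the wrong kind, is unbalanced.
            if not expected_closers or expected_closers.pop() != ch:
                return False
    # Any opener still pending was never closed.
    return not expected_closers
```

A string without any Bidi control characters passes trivially, which matches the view that plain Unicode text needs no special treatment.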

On the point of directionality across string values: the "components" list properties of the Name and Address objects are defined as "components SHOULD be ordered such that their values joined as a String produce a valid full name/address of this entity. If so, implementations MUST set the isOrdered property value to "true"." That's all there is to define preference for a specific order of name or address components. How to render these components is out of scope of this specification.
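
As a rough sketch of "values joined as a String", the following Python joins component values in their stored order. The function name, the plain list of strings, and the optional wrapping of each value in FSI ... PDI are all assumptions made for illustration; the draft prescribes neither the isolation step nor any rendering behavior.

```python
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def join_components(values, isolate=False):
    """Join component values in logical (stored) order.

    Hypothetical helper. With isolate=True, each value is wrapped in
    FSI ... PDI so that its direction is resolved independently of its
    neighbors; this is one possible producer choice, not a requirement.
    """
    if isolate:
        values = [f"{FSI}{value}{PDI}" for value in values]
    return "".join(values)
```

For example, join_components(["Jane", " ", "Doe"]) yields the logical string "Jane Doe"; how a bidirectional result is laid out visually is still decided by the Unicode Bidirectional Algorithm in the renderer.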

That's not to say that I'm generally opposed to adding BiDi-related information to this spec. I just still don't feel like the current definitions are missing something, or conversely that the BiDi-points raised give a full picture of what might need to be addressed because it's missing in the underlying Unicode spec.

Regards,
Robert


On Sun, Aug 20, 2023, at 6:02 PM, Martin J. Dürst wrote:
Sorry for not replying sooner. Currently offline, so I can't check the 
actual wording.

On 2023-07-04 16:27, Robert Stepanek wrote:
> On Tue, Jul 4, 2023, at 8:52 AM, Carsten Bormann wrote:
>> On 4. Jul 2023, at 08:47, Robert Stepanek <rsto@xxxxxxxxxxxxxxxx> wrote:
>>>
>>> The "dir" attribute just contains the equivalent markup for Unicode sequences such as RLI ... PDI. Any text in JSContact is a UTF-8 encoded string which can contain the Unicode Bidi_Control code points, so there is no need for markup.
>>
>> That is certainly one way to do this.
>> So you assume all RTL text contains these sequences?  You maybe should say so.

> I do not assume that, it's up to implementations if they add those sequences and render them accordingly.

Implementations are expected to interoperate, so we need to be a bit 
more precise.

For the individual text pieces (e.g. a surname or the street part of an 
address), the above is true. But the spec should say very explicitly 
that for every opening tag, there should also be a corresponding closing 
tag. And the spec should also say that for unidirectional text pieces 
(e.g. a given name with only Arabic or only Hebrew letters, plus 
neutrals such as spaces in between) no Bidi control characters are needed.
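
A minimal sketch, assuming Python's standard unicodedata module, of how an implementation might test whether a text piece is unidirectional in this sense. The function names are hypothetical, and treating only the strong Bidi classes L, R and AL as direction-bearing (ignoring digits, spaces and other weak or neutral characters) is a simplification chosen for this example.

```python
import unicodedata


def strong_directions(text):
    """Collect the directions of all strongly typed characters in text."""
    directions = set()
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            directions.add("ltr")
        elif bidi_class in ("R", "AL"):
            directions.add("rtl")
        # Weak and neutral classes (digits, spaces, punctuation) are ignored.
    return directions


def is_unidirectional(text):
    """True if text has strong characters of at most one direction."""
    return len(strong_directions(text)) <= 1
```

Under this simplification, a Hebrew-only or Arabic-only name with spaces counts as unidirectional, while a value mixing Latin and Hebrew letters does not.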

> That's not different than with any other Unicode sequences. We only added a reference to TR-9 because we got asked during review about bidirectional text. Maybe explicitly mentioning this part of Unicode brings up more confusion, and we might rather highlight instead that text may be any valid UTF-8 encoded Unicode.

>> How do these sequences compose, e.g., when building a name from its components?

> The document recommends to concatenate the components string values in order. That should work fine with properly beginning and ending formatting characters.

"Concatenate the components string values in order" works well for the 
internal logical storage. But for display, the question is which order. 
LTR or RTL? The right answer may be a mixture. If you have several 
pieces that are RTL, it's easier to read them in RTL order. Same for 
several pieces in LTR order. But if you have an RTL component at the end 
of the name part, and another RTL component at the start of the address 
part, and you display the whole thing inline, it's unclear whether the 
reader wants the name pieces and the address pieces clearly separated 
(leading to more 'jumps' of the reading sequence) or wants subsequent 
pieces that read the same way in the respective order independent of 
whether that visually interleaves e.g. name and address parts.

We had very similar questions when discussing the display of bidi IRIs 
(the bidi solution in RFC 3987 is just one way of doing things, not 
necessarily the preferred one for all users and all kinds of sequences).

And for names/addresses, the addressee may also have a preference (i.e. 
"I write my name with the components LTR" or "I write my name with the 
components RTL"). But I'm not sure if mixed-direction (e.g. mixed-script) 
names are actually "a thing" in the relevant regions. If not, or "not 
really", then it might be worth putting something into the spec to 
recommend avoidance of mixed-direction data. (I don't remember the 
details, but it would be okay to have e.g. an English and a Hebrew 
locale, but the English locale should be all Latin/LTR, and the Hebrew 
locale should be all Hebrew (script)/RTL.) In that case, neither DIR 
attributes nor bidi control characters would be needed.

Regards,   Martin.

