Hi Martin,
thanks again for your feedback (also your private one when we asked about more info on pronunciation).
Before I get back to all your remarks, could you please recheck our proposal for supporting Japanese addresses in the upcoming draft version 12? Using the address of the Tokyo Post Office as an example:

"addresses": {
  "k26": {
    "fullAddress": "2-7-2 Marunouchi, Chiyoda-ku, Tokyo\n100-8994\nJapan",
    "street": [
      { "kind": "block", "value": "2" },
      { "kind": "separator", "value": "-" },
      { "kind": "building", "value": "7" },
      { "kind": "separator", "value": "-" },
      { "kind": "number", "value": "2" },
      { "kind": "separator", "value": " " },
      { "kind": "subdistrict", "value": "Marunouchi" },
      { "kind": "district", "value": "Chiyoda-ku" },
      { "kind": "locality", "value": "Tokyo" }
    ],
    "postcode": "100-8994",
    "country": "Japan",
    "defaultSeparator": ", "
  }
},
"localizations": {
  "jp": {
    "k26": {
      "fullAddress": "〒100-8994\n東京都千代田区丸ノ内2-7-2",
      "street": [
        { "kind": "locality", "value": "東京" },
        { "kind": "separator", "value": "都" },
        { "kind": "district", "value": "千代田区" },
        { "kind": "subdistrict", "value": "丸ノ内" },
        { "kind": "block", "value": "2" },
        { "kind": "building", "value": "7" },
        { "kind": "number", "value": "2" }
      ],
      "postcode": "〒100-8994",
      "defaultSeparator": ""
    }
  }
}
Would it be better for "都" to be part of the locality, as in "東京都"?
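To make the intended effect of defaultSeparator in the proposal above concrete, here is a minimal Python sketch of how a renderer might join the street components. It is only an illustration under a stated assumption: explicit "separator" components are emitted verbatim, and defaultSeparator is inserted only between two adjacent non-separator components. The function name join_street is hypothetical and not part of the draft.

```python
def join_street(components, default_separator):
    """Join street components into one line (hypothetical helper).

    Assumption: a "separator" component is emitted as-is, and
    default_separator is inserted only between two adjacent
    non-separator components.
    """
    parts = []
    prev_was_value = False
    for comp in components:
        if comp["kind"] == "separator":
            parts.append(comp["value"])
            prev_was_value = False
        else:
            if prev_was_value:
                parts.append(default_separator)
            parts.append(comp["value"])
            prev_was_value = True
    return "".join(parts)


# Components copied from the proposal above.
street_en = [
    {"kind": "block", "value": "2"},
    {"kind": "separator", "value": "-"},
    {"kind": "building", "value": "7"},
    {"kind": "separator", "value": "-"},
    {"kind": "number", "value": "2"},
    {"kind": "separator", "value": " "},
    {"kind": "subdistrict", "value": "Marunouchi"},
    {"kind": "district", "value": "Chiyoda-ku"},
    {"kind": "locality", "value": "Tokyo"},
]
street_jp = [
    {"kind": "locality", "value": "東京"},
    {"kind": "separator", "value": "都"},
    {"kind": "district", "value": "千代田区"},
    {"kind": "subdistrict", "value": "丸ノ内"},
    {"kind": "block", "value": "2"},
    {"kind": "building", "value": "7"},
    {"kind": "number", "value": "2"},
]

line_en = join_street(street_en, ", ")
line_jp = join_street(street_jp, "")
print(line_en)
print(line_jp)
```

Under this assumption the English components reproduce the first line of fullAddress ("2-7-2 Marunouchi, Chiyoda-ku, Tokyo"). The "jp" components as listed would render as 東京都千代田区丸ノ内272, because no "-" separators are given between block, building and number.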
Best regards,
Robert
On Thu, Jun 22, 2023, at 9:39 AM, Martin J. Dürst wrote:
Hello Robert, everybody,

On 2023-06-02 22:11, Robert Stepanek wrote:
> Thanks for your review. We have now published version 11, which includes much of your feedback. However, at this point we do not plan to take action on what you describe as major issues. Could you please see our questions below so that we can better understand your concerns?
>
> On Fri, Apr 21, 2023, at 11:43 AM, Martin Dürst via Datatracker wrote:
>> Reviewer: Martin Dürst
>> Review result: Not Ready
>>
>> Summary: The document isn't ready for publication.
>>
>> [This is essentially a *really* hard problem (*). If some of the issues raised below are not addressed, they should at least be clearly documented.
>>
>> (*) I know several people close to the Unicode consortium who have worked on these issues; they essentially never thought they were done :-(]
>>
>> [version reviewed: mostly -07, to some extent checked against -10]
>>
>> Major issues:
>>
>> - The format uses @type (and probably @version) in a way very similar to JSON-LD (to the extent that somebody at IETF 116 told me it was JSON-LD), but at least the fact that @context is missing seems to strongly indicate that it's not JSON-LD. The idiosyncratic way data is arranged in the format, often with way more levels than what a straightforward design might produce, would be much easier to swallow if the document clearly indicated what kind of general conventions it used, and how these conventions were similar and different from more well-known conventions (such as JSON-LD). Of course, even better than just documentation would be to fix things so that the format isn't idiosyncratic, but uses well established and documented conventions.
>
> We find it difficult to take action on these major issues:
>
> Which specific parts of the data model use too many levels?
> What would a straightforward design for these look like?

As a simple example, why

"name": { "components": [ { "kind": "given", "value": "John" }, { "kind": "surname", "value": "Doe" } ] }

and not just

"name": [ { "given": "John" }, { "surname": "Doe" } ]

or maybe

"name": { "components": [ { "given": "John" }, { "surname": "Doe" } ] }

> How is JSON-LD, a Linked Data format, relevant in the context of JSContact?

At IETF 116, I asked around to find somebody who might be able to explain the design decisions re. JSON in this draft, and somebody (sorry, forgot who that was) mentioned JSON-LD. Then I had a quick look at JSON-LD, and found a @type attribute, so I thought there was a connection. But as I say above, it's not exactly JSON-LD.

In my long-time experience with standards, something that takes part of something more basic, but not all of it, is in most cases a bad idea. If the reason for the way JSContact data is structured is "we just made up that structure" or "we took some ideas from here and there, but don't remember that anymore", then that's not a good sign. If there is a more coherent explanation for your choices, it would be good to give that somewhere.

The advantage of JSON-LD would of course be that your format would be clearly related to other technology. Using @context would ground your work with URIs and make it easier to mix information from JSContact with other data.

The fact that you have made @type optional may help here, because it removes associations with JSON-LD. Something clearly not JSON-LD may be better than something almost JSON-LD, although something really JSON-LD may still be better.

>> - It says (at the start of Section 1) that this is an alternative to vCard (and xCard and jCard). It should explain more clearly (assuming e.g. that the underlying format is JSON) in what cases jCard should be chosen and in what cases JSContact.
>> Just defining "yet another format" doesn't make sense.
>
> Version 10 already addresses this in section 1.1.
>
>> - I'm not usually doing this, but by chance, I read the Gen-ART review for this document. I fully support it. In particular with respect to legal vs. preferred names, there's also the example of researchers preferring to use their maiden name in an academic context, and there are cases of people with multiple nationalities that may have different names in each nationality because of legal requirements (the last case is orthogonal to the locale/script issue).
>
> Version 10 defines the name property as "This can be any type of name, e.g. it can but need not be the legal name of a person".
>
> We chose not to support multiple names in order to stay compatible with vCard in this matter. Would it help to have one "main" name and multiple "alternative" names?

You already have multiple locales. People could give different names in the different locales. In some cases, that would be exactly the right thing, but it would be a bad idea to have to use locales to 'cheat' in order to be able to represent multiple names.

>> - In Japanese, it is very important to not only have the name itself (usually in Kanji), but also its pronunciation. Same for addresses. Some names (e.g. 田中/Tanaka) are read without problems by anybody in Japan, but there are others which are essentially impossible to read without separate information. The spec should clearly indicate how pronunciation for names and addresses is indicated to cover this. Such information is given on most forms, and exists in most databases.
>
> Thanks for highlighting this. We are working on a proposal.

(see my separate (private) mail)

>> - Some (names or) addresses in the Near East (Arabic/Hebrew/... script) may contain data of mixed directionality (right-to-left as well as left-to-right).
>> The document contains absolutely no information about how to deal with such issues.
>
> The document defines "locale" as top-level property to indicate a default locale for text contained in the Card, as well as the "localizations" property for specific text elements. We consider supporting multiple locales within a single String property out of scope.

This is NOT about supporting multiple locales within a single String. It's about bidirectionality. The fact that a name or address, or a component thereof, is in a certain locale doesn't guarantee that all the characters in that piece have the same directionality.

>> - The way a name (and some other information) can be composed of components, together with extensibility, provides a lot of mileage to deal with the very wide variety of name components and formats. However, there are several issues:
>>   1) Reuse where it's only halfway appropriate. In an example in Figure 31, the document uses "type": "middle" for a Russian patronymic. This seems to be based on the interpretation that the patronymic is "kind of like a middle name". But it's only "kind of". A patronymic wouldn't be initialized, whereas a middle name e.g. in the US is extremely frequently only given as an initial.
>>   2) Definition by example: Figure 31 is only an example. Does it mean Russian patronymics should be labeled as "type": "middle", or what else?
>
> Thanks. We now updated the example, such that Mikhailovich is of kind "surname".

If you got some highly reliable information from some other source that says that Russian patronymics should be treated like surnames, then that may be fine. Otherwise, turning back to middle name may be better. The problem is that a Russian patronymic is neither a US middle name nor a surname.

My comment was on a higher level: a) How are you going to make sure that for those cases where you have an example in your spec (i.e. the above Russian patronymic), people follow your spec (even though it's just an example).
b) How are people in cultures where there are no examples in your spec going to figure out which parts of their local naming conventions correspond to which pieces in the spec, in a way that results in a uniform use of pieces for components in that locale (and hopefully locales with equivalent phenomena).

>>   3) Extensibility will be needed for many countries and cultures, but most of these are not used to proactively register things with IANA, because they may assume they have to fit into the base scheme, or because they do not understand the value of such registrations.
>
> The format allows for non-standard extensions, at the cost of interoperability. XXX care to clearly outline in the document how to register new standard items for JSContact.
>
>>   4) Depending on culture and language, there are many different ways to address or refer to a person.
>
> Other than speakToAs, we consider this out of scope.
>
>> - When names,... are composed, the default is to use a space as a separator. There are many scripts (Chinese/Japanese/Korean/Thai/...) where words, and therefore (at least in running text) name components are not separated. In the current design (as I understand it), that would mean to add separator fields between every pair of fields. It would be good to have something like a "default separator" to not have to repeat one and the same separator several times.
>
> We now added a defaultSeparator property which defines what characters to insert between name component values.

Thanks!

>> - There are many examples for parts of the specification, but no overall example.
>
> Once the specification is complete, we'll add one to the Appendix.

Thanks!

>> Details:
>>
>> Introduction:
>> "The attributes of the card data represented must be described as a simple key-value pair, reducing complexity of its representation."
>> -> "The attributes of the card data represented must be described as simple key-value pairs, reducing complexity of their representation."
>
> Done
>
>> 1.9.1: What about case sensitivity? ABNF is case insensitive, but as far as I understand, JSON object keys are case sensitive.
>
> The property names are case-sensitive.

So in

"kind": "given",
"value": "John"

both "kind" and "value" are case sensitive because of JSON, and because of what your spec says about member names in JSON objects (1.3). But what about "given"? Would "Given" or "GIVEN" be okay? Where does the spec say so?

>> Figure 1: Why does the ABNF syntax just above not need a figure number, but then all the examples need one? Labeling text as "Figure" looks weird, "Example" would be better, but is probably also not needed.
>
> We wrap artwork and sourcecode tags in figure as it allows us to refer to them easily. We forgot to do that in this case, it's now fixed. We might also get rid of all figure tags, I haven't checked how that will turn out, but at least it now is consistent.
>
>> 2. Card: This starts without any introductory sentence whatever. Such a sentence should be added. It's also unclear to me why this specification uses the term "card" when the title uses the word "contact" twice, but never card. It might be better to change this to "contact".
>
> Added introductory sentence. We will keep the name "Card", as it succinctly names the item that stores contact information, such as a business card. It is also in line with vCard.
>
>> The mime type says "application/jscontact+json;type=card". It's unclear why "type=card" is needed. The only thing contained in the jscontact spec are cards, so application/jscontact+json should be enough.
>
> We keep the "type" parameter but make it optional.
>
> Previous versions of this I-D had types Card and CardGroup, but the latter got removed.
> We still want to keep this extensible and want implementations to be prepared for that.

Would you expect CardGroup documents (or any future new kinds of documents) to be handled by different applications than Card documents? Or would you expect applications to not be able to distinguish between these different kinds of documents? If not, what purpose does the parameter serve?

>> 2.1.5 locale, and 2.7.1: It may often be the case that a single set of data could be suited for more than one locale, but this cannot be expressed currently.
>>
>> The spec forces one of the locales to be the 'main' locale, the others to be localizations. This is quite in contrast to most other parts, where alternatives are treated on an equal footing, maybe with some preference indication. Why this imbalance? It may be inappropriate for some applications or users. (what if there's a requirement to treat different localizations as equivalent?)
>
> A localization need not override all properties, it can just override a subset.

Looking at the algorithm again, that's indeed the case. Maybe it would be good to have an example where that actually happens (both Fig. 36 and 37 have 'complete' localizations).

> Any property values that are not localized are available across all localized variants of the Card. Likewise, a localization can remove a property without setting a new value, if the property value in the main Card would not be appropriate for that locale.
>
>> 2.1.6: Using 'true' values rather than simply an array of UUIDs seems somewhat abstruse. Where does this kind of stuff come from?
>
> JSContact, and RFC 8984, are designed to work well with the JMAP protocol (RFC 8620), which does not allow removing single items of an array. A JSON set is the idiom in this context.

If the JSON spec defines sets, then that should be referenced. If another spec defines JSON sets, then that should be referenced, even if just informally.

>> 2.1.7: Why does this use SGML syntax? Is that mandatory?
>> Say what values are allowed here and what not.
>
> The example uses SGML syntax because it matches the vCard PRODID property example. I now replaced the example with a free-text identifier. I made clear that this may be any non-empty string value.
>
>> 2.2.1: Probably due to xml2rfc or some other software, this has double spaces after periods where very clearly, there should be only one period ("Mr. John Q. Public, Esq.").
>
> Fixed.
>
>> 2.2.4, organizations: The example in Figure 15 has two units. Is the order of the units outside in or inside out? Or is this an example for a matrix organization?
>
> Fixed. The section now states that the list of units is ordered descending by hierarchy (e.g. a geographic or functional division sorts before a department within that division).
>
>> 2.3.2: Why do 'impp' and 'uri' have to be distinguished? This should be clear from the URI scheme in the "user" field.
>
> Fixed. This was for compatibility with vCard, but we found a better way to do that now.
>
>> 2.3.3: "cell": Please change this to "mobile", which is way more popular according to Google ("cell" really sounds antiquated to me, but your mileage may vary).
>
> Fixed
>
>> 2.3.4: "preferredContactChannels" and "ContactChannelPreference" seem to be a waste of bytes (but only the most egregious out of many).
>
> Right, we renamed that to "contactBy". The @type properties now are optional and can be omitted in almost all cases. This considerably reduces the byte size of JSContact data.
>
>> 2.5.1: "street": There are countries (in particular Japan) that do not use street addresses, but a more hierarchical block-based system. The spec should say that the "street" field includes such cases, or should explain how to denote them.
>
> Version 11 now adds additional street component types that allow for such addresses.
> We also added more international addresses to demonstrate this.

I'm surprised you use "Irgendwostrasse" (would be okay in Switzerland) and not "Irgendwostraße" (the correct form in Germany and Austria).

As for "3 Chome-7-1 Nishishinjuku, Shinjuku City, Tokyo, Japan", { "kind": "apartment", "value": "3" } is definitely wrong.

More compact, this is
3-7-1 Nishishinjuku, Shinjuku City, Tokyo, Japan

More expanded, this is
3 Chome 7 Ban 1 Go Nishishinjuku, Shinjuku City, Tokyo, Japan

In Japanese (fully outside-in), it is 東京都新宿区西新宿3-7-1 or 東京都新宿区西新宿3丁目7-1 or 東京都新宿区西新宿3丁目7番1号。

3 is the number of a rather big block (including two Hotels and Tokyo Opera City), not an apartment number.

In general (there are some exceptions), Japanese addresses have three numbers for the smallest blocks (which may be a single apartment building or a few small single-family houses); if there's an apartment number, it's usually the fourth number.

Regards, Martin.

>> Why are separators ignoreable in "street"?
>
> We forgot to remove this when we removed unnecessary MAYs from the document. Implementations are free to generate the address however they deem appropriate.
>
>> In Figure 25, why are numbers given before names in the fullAddress field, but after in the StreetComponents?
>
> Fixed.
>
>> 2.6.3: Why are there no "kind"s for blogs, web pages,...?
>
> We chose the minimal set of required kinds for interoperability with vCard. We encourage registering additional kinds at IANA.
>
>> 2.6.4: "The resource is a photograph or avatar." -> "The resource is a photograph of the person or picture of their (one of their) avatar(s).": In my understanding, a jpeg file isn't an avatar, but just a rendition of an avatar. An avatar may be 3-dimensional, or have various different renderings,...
>
> I think the rephrased section in version 10 already meets your concern.
> Agreed?
>
>> "graphic image or logo associated with entity" -> "graphic image or logo associated with the entity"
>
> Fixed.
>
>> 2.7.1: "a localized Card SHOULD NOT contain more information than its non-localized variant": Also say that the information shouldn't be different. On top of that the "localizations" structure is part of the card, so the term "localized Card" doesn't seem appropriate here. (It would be appropriate for a separate card that is a localized version.)
>
> Version 10 already addressed this somewhat, but I rephrased again for clarity: "This localizes property values in this Card to languages other than the main locale. Localizations provide locale-specific alternatives for existing property values and SHOULD NOT add new properties."
>
>> Figure 31: What is the notation used in "addresses/addr1/locality"? I assume a path indicating what to patch. But then, Figure 32 doesn't use this syntax. Why not?
>
> The localizations property contains values of type PatchObject. This type is defined in section 1.5.3.
>
>> 2.8.1, kind: "This RFC defines a small set of common anniversary types, additional types MAY be registered at IANA (Section 4.6.2)": Don't talk about types when you label them "kind". Also, the language for extension by RFC or registration or private use is not consistent throughout the spec. If there's one single way of doing extensions (i.e. all extension points allow definition by additional RFCs and IANA registrations and private stuff), then clearly say so somewhere, and define a short term for this kind of extensibility. If there are two or three different ways to do this (e.g. some places, private extensions are allowed, but others not), then again define the various categories in a single place and then use the defined terms.
>
> Fixed
>
>> "Note that for calendar systems with leap months, the year property might be required to convert between the Gregorian calendar date and the respective calendar system."
>> This is not limited to calendar systems with leap months. It would be the case for a calendar with 12 months of 30 days each, too.
>
> Fixed
>
>> 2.8.2, keywords: See above at 2.1.6.
>
> Same rationale as for "members" property applies.
>
>> 2.8.4: "This is free-text, but future specifications MAY restrict allowed values depending on the type of this PersonalInfo.": It should be made clear that such restrictions will not be applied to the currently defined kinds (expertise, hobby, interest). Otherwise, we have a compatibility problem.
>
> We removed this superfluous sentence.
>
>> 3. "status of known implementations of the protocol": This is not a protocol, but a format. The fact that only one implementation seems to exist, and only in alpha, doesn't necessarily support moving this spec forward quickly.
>
> We added the Cyrus IMAP server implementation. There are a couple of implementors waiting for this RFC to be published before they implement JSContact.
>
>> For fields that say "this document", replace with "RFC XXXX".
>
> Done for IANA Considerations section.
>
>> Shortly before 4.3.1: "check it is coherent" -> "check whether it is coherent"
>
> Fixed.
>
>> Both Table 3 and table 4 have the same title, but totally different content. Please check.
>
> Fixed.
>
>> Security Considerations:
>>
>> Probably worth mentioning that data should only be collected and distributed on a need-to-know basis.
>>
>> "JSON uses opening and closing tags for several types and structures"
>> It's the first time I have seen {, }, [, and ] being called "tags".
>
> Fixed with "brackets".
>
>> " Since JSON does not use explicit string lengths, the risk of denial of service due to resource exhaustion is small": Not sure about this. It all depends on the implementation.
>> An implementation may believe a large string length, or it may allocate a large buffer just in case because it doesn't have any information about string length.
>
> This is a verbatim copy of the Security Considerations in RFCs 8620 and 8984. The second part of that sentence in version 10 already addresses your concerns: "[...] but implementations may still wish to place limits on the size of allocations they are willing to make in any given context, to avoid untrusted data causing excessive memory allocation."