Reviewer: Dale Worley Review result: Ready with Issues I am the assigned Gen-ART reviewer for this draft. The General Area Review Team (Gen-ART) reviews all IETF documents being processed by the IESG for the IETF Chair. Please treat these comments just like any other last call comments. For more information, please see the FAQ at <https://wiki.ietf.org/en/group/gen/GenArtFAQ>. Document: draft-ietf-httpbis-rfc6265bis Reviewer: Dale R. Worley Review Date: IETF LC End Date: 2025-02-11 IESG Telechat date: 2025-02-20 Summary: This draft is on the right track but has open issues, described in the review. Technical issues: * Whitespace at the end of a cookie value when parsing a cookie string Sec. 4.1.1 says: set-cookie-string = BWS cookie-pair *( BWS ";" OWS cookie-av ) cookie-av = expires-av / max-age-av / domain-av / path-av / secure-av / httponly-av / samesite-av / extension-av extension-av = *av-octet av-octet = %x20-3A / %x3C-7E ; any CHAR except CTLs or ";" This is ambiguous for parsing extension-av. E.g. Set-Cookie: name=value;attr1= v a l u e ;attr2=x Does the value of attr1 start with "v" or with " "? Does it end with "e" or with " "? RFC 9110 sec. 5.6.3 says that BWS can be deleted by intermediate processes, so presence of this trailing whitespace can't be relied upon, which suggests that attribute values should never end with whitespace. Looking at sec. 5.6 step 5 5. Remove any leading or trailing WSP characters from the attribute- name string and the attribute-value string. shows that the intention is that the attribute name and value should have no leading or trailing whitespace. That would be better handled by being explicit in the grammar: extension-av = ext-av-name / ext-av-name "=" ext-av-value ext-av-name = null / av-name-non-wsp / av-name-non-wsp *av-name-octet av-name-non-wsp av-name-non-wsp = %x21-3A / %x3C / %x3E-7E ; any CHAR except CTLs, " ", ";", or "=" av-name-octet = %x20-3A / %x3C / %x3E-7E ; any CHAR except CTLs, ";", or "=" ext-av-value = null / av-value-non-wsp / av-value-non-wsp *av-value-octet av-value-non-wsp av-value-non-wsp = %x21-3A / %x3C-7E ; any CHAR except CTLs, " ", or ";" av-value-octet = %x20-3A / %x3C-7E ; any CHAR except CTLs or ";" or perhaps by escaping into English: extension-av = ext-av-name / ext-av-name "=" ext-av-value ext-av-name = *av-name-octet ; but neither the first nor the last is " " av-name-octet = %x20-3A / %x3C / %x3E-7E ; any CHAR except CTLs, ";", or "=" ext-av-value = *av-value-octet ; but neither the first nor the last is " " av-value-octet = %x20-3A / %x3C-7E ; any CHAR except CTLs or ";" Then again, since this grammar is for generating set-cookie-string's by well-behaved servers, perhaps whitespace in ext-av-name's should be omitted: ext-av-name = *av-name-octet av-name-octet = %x21-3A / %x3C / %x3E-7E ; any CHAR except CTLs, " ", ";", or "=" * Description of worker-based requests 5.2.2. Worker-based requests Note: The descriptions below assume that workers must be same-origin with the documents that instantiate them. If this invariant changes, we'll need to take the worker's script's URI into account when determining their status. The import of this is not clear to me. One possibility is that "Currently, all implementations require that workers must be same-origin with the documents that instantiate them." Another possibility is "We only discuss the case when a worker is same-origin with the document that instantiates it." In either case, this should include/point to a discussion of what to do when this is *not* the case, or a statement that it is out of scope for this document. * What is the universe of characters? RFC 9110 prescribes that HTTP headers be in US-ASCII. However the existence of non-HTTP APIs for the cookie store raises the possibility that a larger set of characters might be supported. As far as I can tell, this document does not specify the universe of characters. Unfortunately, this question is significant. For example in sec. 5.6 is: 1. If the set-cookie-string contains a %x00-08 / %x0A-1F / %x7F character (CTL characters excluding HTAB): Abort these steps and ignore the set-cookie-string entirely. The practical effect of this depends on the universe of characters. And in any case, it would be clearer to phrase this as follows (as it makes clearer what assertion about the string holds after this step is executed): 1. If the set-cookie-string contains one or more characters not in the set %x09 / %x20-7E (HTAB, SPC, and VCHAR): Abort these steps and ignore the set-cookie-string entirely. An overall policy/philosophy regarding the universe of characters should be stated toward the beginning of the document, and the various algorithms made consistent with it. For example in sec. 5.7, there are explicit limitations on the characters in cookie names, cookie values, and domain attributes, but not for some other parts of the value. Nits/editorial comments: As a global issue, the capitalization of "Note:" vs. "NOTE:" is inconsistent. 2.1. Conformance Criteria The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119]. This should be replaced with the RFC 8174 version: The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here. 2.3. Terminology The terms "active browsing context", "active document", "ancestor navigables", "container document", "content navigable", "dedicated worker", "Document", "inclusive ancestor navigables", "navigable", "opaque origin", "sandboxed origin browsing context flag", "shared worker", "the worker's Documents", "top-level traversable", and "WorkerGlobalScope" are defined in [HTML]. "navigate" should be added to this list because of the later paragraph: The term "top-level navigation" refers to a navigation of a top-level traversable. Though I notice that [HTML] is not well-indexed, as it has no index/glossary of specialized terminology used in itself, so it's hard to find the definition of "navigate". 4.1.1. Syntax Servers SHOULD NOT send Set-Cookie header fields that fail to conform to the following grammar: Is it established that a "header field" is the value part of a "header" (which is what the following grammar describes)? RFC 9110 sec. 6.3 seems to indicate that the header name is part of a "header field": Fields (Section 5) that are sent or received before the content are referred to as "header fields" (or just "headers", colloquially). RFC 9110 sec. 5.1 suggests that the usual term is "field line value", or "field value" for the concatenation of the field line values of multiple headers. Similar considerations apply to sec. 4.3.1 of this document: the user agent will send a Cookie header field that conforms to the following grammar: and probably other places. 4.2.1. Syntax many popular implementations have default limits of 8K. This should probably be "8K octets", or better, "8192 octets". Compare with other length limit specifications in this document. 5.1.1. Dates 5. Abort these steps and fail to parse the cookie-date if: I would say "Abort this algorithm". There are other instances of this usage, too. 5.4. Cookie Name Prefixes To prevent these issues, UAs MUST match cookie name prefixes case- insensitive. should be "insensitively". 5.6. The Set-Cookie Header Field Let the cookie-av string be the characters consumed in this step. To this should be added, "Update the unparsed-attributes to be the part of unparsed-attributes that is not consumed." As written, the text depends on the verb "consumed" to imply this action. 5.7. Storage Model 3. If the cookie-name or the cookie-value contains a %x00-08 / %x0A-1F / %x7F character (CTL characters excluding HTAB), abort these steps and ignore the cookie entirely. Of course, such a set-cookie should have been ignored when sec. 5.6 processing was applied. But having this text is useful if the cookie store can be accessed by an API that does not go through set-cookie-string parsing. Note the technical issue about the universe of characters applies here as well. 8. If the domain-attribute contains a character that is not in the range of [USASCII] characters, abort these steps and ignore the cookie entirely. "[USASCII]" doesn't seem to be strictly defined. E.g. the term is used descriptively in sec 2.2 but a definition is not explicitly imported from RFC 5234, and 5234 doesn't use the term. Probably what is meant is "is not in CHAR", which is explicitly defined. But again, a set-cookie string that contained such characters would already have been ignored. And that a character set restriction is given specifically for the domain-attribute is odd; it seems like it should be subsumed in an overall character set policy. 5.8.1. The Cookie Header Field For example, the user agent might wish to block sending cookies during "third-party" requests from setting cookies (see Section 7.1). I can't understand this sentence. It seems to speak of "requests from setting cookies" but that makes no sense. While this specification requires that a single cookie- string be produced, some user agents may split that string across multiple cookie header fields. Does this mean "user agents MAY split that string"? Or perhaps, "user agents SHOULD NOT split the sting"? If so, it should say so explicitly rather than the current wording, which could possibly be a warning that some user agents have not-strictly-conformant behavior. 5.8.2. Non-HTTP APIs A user agent MAY return an empty cookie-string in certain contexts, such as when a retrieval occurs within a third-party context (see Section 7.1). Shouldn't this be in sec 5.8? And for that matter, wouldn't the return also be empty if there simply were no matching cookies in the store? 5.8.3. Retrieval Algorithm This sentence If this change results in a cookie's domain becoming a public suffix then that cookie is considered invalid as it would have been rejected during creation would be better as: If this change results in a cookie's domain becoming a public suffix then that cookie is considered invalid as it would have been rejected during creation had it been created now That change being within this change: NOTE: (For user agents configured to reject "public suffixes") It's possible that the public suffix list was changed since a cookie was created. If this change results in a cookie's domain becoming a public suffix then that cookie is considered invalid as it would have been rejected during creation (See Section 5.7 step 9). User agents should be careful to avoid retrieving these invalid cookies even if they domain-match the host of the retrieval's URI. should be made an explicit processing step: * For user agents configured to reject "public suffixes", the cookie's domain is not a public suffix. (It's possible that the public suffix list was changed since the cookie was created. If this change results in the cookie's domain becoming a public suffix then that cookie is considered invalid as it would have been rejected during creation if it had been created now (See Section 5.7 step 9).) -- Step 4/3 says: 3. If there is an unprocessed cookie in the cookie-list, output the characters %x3B and %x20 ("; "). For quite a while I mis-read what this meant. I think it would be much clearer if it was phrased: 3. If the cookie is not the last cookie in the cookie-list, output the characters %x3B and %x20 ("; "). 6.2. Application Programming Interfaces One reason the Cookie and Set-Cookie header fields use such esoteric syntax is that many platforms (both in servers and user agents) provide a string-based application programming interface (API) to cookies, requiring application-layer programmers to generate and parse the syntax used by the Cookie and Set-Cookie header fields, which many programmers have done incorrectly, resulting in interoperability problems. Is "esoteric" the correct word? It seems that "complex" is more correct. Actually, it's not so much that the *syntax* is complex but that the semantic processing is rather baroque, with the reason for that being historical compatibility considerations. 9.2. Set-Cookie The permanent message header field registry (see [HttpFieldNameRegistry]) ... probably would be better copying 9.1 as: The HTTP Field Name Registry (see [HttpFieldNameRegistry]) ... 9.3.1. Procedure Note that attribute names are generally defined in CamelCase, but technically accepted case-insensitively. What does "technically" mean here? Given that this section is pointed to from outside the RFC, the sentence should not require further context to understand. I think what is meant is that names are accepted case-insensitively. But really, you want to replace this sentence with something normative like: Note that attribute names are generally defined in CamelCase but MUST be recognized case-insensitively. Two attribute names that are case-insensitively the same MUST NOT be registered. 10.1. Normative References [HTML] Hickson, I., Pieters, S., van Kesteren, A., Jägenstedt, P., and D. Denicola, "HTML", n.d., <https://html.spec.whatwg.org/>. This normative reference is impossible given that the URL fetches a resource that is described as "living", meaning "changing over time", but this reference doesn't specify the version as of a specific time. Hence, it is impossible to *normatively* say that [HTML] does nor does not specify any particular thing. [END] -- last-call mailing list -- last-call@xxxxxxxx To unsubscribe send an email to last-call-leave@xxxxxxxx