[Last-Call] Re: Artart last call review of draft-ietf-httpbis-compression-dictionary-16

Patrick Meenan <pmeenan=40google.com@xxxxxxxxxxxxxx> · Mon, 26 Aug 2024 10:33:12 -0400

I just published draft-17 with the edits you recommended (thanks): https://datatracker.ietf.org/doc/draft-ietf-httpbis-compression-dictionary/17/
Let me know if it looks like I missed anything or there are still some areas of concern.

For the URL Pattern references, I changed it to explicitly instantiate an instance of the class and used anchor references into the spec for the class and individual methods that it calls. Hopefully that makes it clearer (the anchors are also supposed to stay stable for the long-term with the living standards).

For the fetch destination, I tweaked the language a bit to make it clear that the match-dest is optional and may not be included (and the comparison should be skipped if it is not included). As far as clients that don't support destinations, I think it would still work better if they were allowed to use the URL pattern matching without considering the dest. The match-dest is basically an additional filter to reduce the noise that a server might see.

For example, in the browser case, there are destinations for "document" (i.e. the main HTML) and "image".

A server might send back:

Use-As-Dictionary: match="/*", match-dest="document"

which would match the dictionary to any document request anywhere on the origin but leave it off for image requests, scripts, etc.

That said, there would be minimal harm in a client announcing that the dictionary was available for a given image request if the server wouldn't try to compress images using it. That would still allow the non-browser client to request resources from the same server using dictionaries without necessarily having to support the destinations (i.e. curl fetching scripts or HTML from a web server).
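To make the client-side behavior described above concrete, here is a minimal sketch in Python. It is not the algorithm from the draft: the StoredDictionary class and dictionary_applies function are hypothetical names, match_dest is modeled as a plain list of strings, and fnmatchcase is only a stand-in for real URL Pattern matching.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Optional


@dataclass
class StoredDictionary:
    """Hypothetical record of what a client keeps from Use-As-Dictionary."""
    match: str  # path pattern taken from the "match" value
    match_dest: list[str] = field(default_factory=list)  # empty: not provided


def dictionary_applies(d: StoredDictionary, request_path: str,
                       request_dest: Optional[str] = None) -> bool:
    """Decide whether the client should advertise this dictionary."""
    # Assumption: fnmatchcase stands in for URL Pattern matching here.
    if not fnmatchcase(request_path, d.match):
        return False
    # Skip the destination comparison when the server sent no match-dest,
    # or when the client has no notion of destinations (request_dest is None).
    if not d.match_dest or request_dest is None:
        return True
    return request_dest in d.match_dest
```

With match="/*" and match-dest "document", a browser would advertise the dictionary for a document request but not for an image request, while a client with no concept of destinations (request_dest left as None) would advertise it for any matching path and leave it to the server to ignore it.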

Thanks,

-Pat

On Sun, Aug 25, 2024 at 6:30 PM Darrel Miller <darrel@xxxxxxxx> wrote:
Hey Patrick, 

Thanks for the quick response

>> ### 2.1.1 match
>>
>> It is concerning that a feature such as this requires taking a dependency on
>> the URL Pattern specification which is a living standard. In the HTTP API
>> space, there are many user agents that are not browsers, that will need to
>> implement URL Pattern and that specification could change at any time.  It
>> would be much preferable if this specification could take a snapshot of the
>> current URL Pattern behavior and define that in this specification.
>
>There was a LOT of bikeshedding on the match pattern. It was originally a custom
>algorithm that only allowed for wildcard but between the w3c and HTTP working
>groups we came to a consensus that standardizing on URL Pattern was a better
>solution, even for non-browser clients. There are already rust and js-based libraries
>and the expectation is that we are going to converge on using it for pattern matching
>in a lot more cases and that there will be libraries available for most platforms to
>make integration easier.
>
>As far as taking a snapshot, this was discussed during the IESG telechat but the
>standard practice for referencing the living standards is to not reference a snapshot
>and that the standard maintainers are responsible for maintaining backward
>compatibility. The same goes for the references into the fetch spec.

I can imagine there was plenty of debate around this. I did try looking for it in the mailing lists and GH but my search skills failed me.

I was unaware of a commitment from the standards maintainers to keep backward compatibility.  That eases my concern considerably.

>> ### 2.1.2 match-dest
>>
>> It is unclear why match-dest would not be an IANA registry of values that are
>> seeded with the values from the Fetch specification. This would allow for
>> values to be added to the registry in order to support the same concept in
>> different user agents that do not use the Fetch specification.  It seems
>> strange to only allow this feature to be used if the Fetch specification is
>> being used to make requests. Is the destination feature not useful to a broader
>> audience?
>
>At some level the set of destinations needs to be maintained in such a way that even
>an IANA list would not contradict the list in the Fetch standard as the Fetch standard
>evolves. That would involve keeping them in sync in such a way that additions to
>either list don't collide with the other. Fundamentally that would mean that either an
>IANA registry would need to reference Fetch and maintain additional destinations or
>that Fetch would need to defer to an IANA registry. At some level it is not that
>different from the registry of link relation types. I'd be ok with requesting a new IANA
>registry if everyone thinks that's the right path but I'm also a bit worried if the w3c
>side would agree that deferring registration of fetch destinations to IANA was
>appropriate.

I understand the desire to avoid having to sync an IANA registry with another specification.  My primary reason for suggesting it was to provide an extensibility point for other clients that might have other "destinations" that are not part of the Fetch standard.  One suggestion might be to have the specification refer to fetch and a registry for "extension destinations".  This does seem like a lot of effort for something that might never be used.  I suppose if it becomes clear that other user-agents want to use destinations there could always be an update to the specification at that point.

> To some extent, the CORS processing also requires a fetch-like client (or for the client
> to not be sensitive to CORS).

I have yet to see non-browser clients attempt to emulate CORS. Although I am starting to see similar things being invented in AI orchestrators that support third party plugins.

> Would it be better if I make the match-dest matching optional on the client even if it is
> specified in the response? The intent is for it to be compatible in that the client will
> advertise dictionaries but it is up to the server to decide to use it or not so if the
> additional filtering provided by match-dest isn't applied and the client advertises an
> inappropriate dictionary, it would just be ignored.

If a server decides it wants to partition the dictionaries based on destination, I think all clients should respect those partitions, regardless of whether they are meaningful to the client. I must admit I am struggling to understand what the impact is on non-browser clients, mostly because of my lack of familiarity with the purpose of request-destinations in browsers.

>> ### 2.1.4 type
>>
>> It is not obvious what the value of this property is.  It has only one value
>> "raw", which is the default value which is described as an "unformatted blob of
>> bytes". It is stated that if a client receives a dictionary of a type that it
>> does not understand, it must not use the dictionary. But type has only one
>> value. How can any other value be returned and be compliant with this
>> specification? There is no described mechanism of how other values for type
>> could be introduced.
>>
>> Said another way, what is lost if we drop this section 2.1.4 completely?
>"type" is there for future-looking backward compatibility. For example, Brotli and
>ZStandard both have encoding-specific dictionary formats that provide some more
>capabilities. If, at some point in the future, a spec decides to use the same dictionary
>negotiation for one of those types, using an unknown "type" would allow existing
>clients to ignore the formats that they do not understand. Otherwise, any future specs
>would have to use a new set of headers entirely (which is an option but would be
>duplicating a lot). Since the same response would never be two different types of
>dictionary, having an optional value that allows for forward/backward compatibility felt
>like a low bar.

Ah. I misread what was written. I read the following to mean that the only valid value was "raw", i.e. no other value was allowed.

   "raw" is the only defined dictionary format 

Perhaps just removing the "only" might avoid the confusion.

   "raw" is defined as a dictionary format which represents..
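The forward-compatibility role of "type" discussed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the SUPPORTED_DICTIONARY_TYPES set, and the idea that the header parameters have already been parsed into a dict are all hypothetical, and "some-future-format" is a made-up value, not a registered type.

```python
# Dictionary formats this hypothetical client knows how to use.
SUPPORTED_DICTIONARY_TYPES = {"raw"}


def usable_dictionary_type(params: dict[str, str]) -> bool:
    """Return True if the client may use a dictionary with these parameters.

    Assumption: params holds already-parsed Use-As-Dictionary values.
    """
    # "raw" is the default when the server does not send "type".
    dictionary_type = params.get("type", "raw")
    # A type the client does not understand means the dictionary is not used.
    return dictionary_type in SUPPORTED_DICTIONARY_TYPES
```

An existing client built this way keeps working if a later specification introduces another type: it simply declines to use dictionaries labeled with a value it does not recognize.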

>> #### 2.2.2 step 7
>>
>> The instructions suggest to run the "test" method.  Looking at the URL Pattern
>> specification it is not immediately clear what the behaviour of the "test"
>> method is. There is a test method defined in some IDL, but it does not
>> reference any defined behaviour.  Looking at the section "High Level
>> Operations" it might be reasonable to assume that the "test" method implements
>> the "match" operation.  It would be helpful to clarify this in the
>> specification.
> The PATTERN in the algorithm is explicitly an instance of the URLPattern class which has
> the "test" method and operation defined:
> https://urlpattern.spec.whatwg.org/#dom-urlpattern-test
> Should I be referencing it in another way to be clear that that is the IDL that it is
> referencing and that the method steps are in the URLPattern spec (or for clarity of
> reading, just a bit more text to "run the 'test' method which executes the URL matching
> algorithm")?

It might be helpful for those of us who are not familiar with WhatWG style specifications to spell out:

6. Let PATTERN be an instance of the URLPattern class [URLPattern] constructed by setting input=MATCH, and baseURL=BASEURL.

This will help overcome the fact that the reference points to the entire specification.  I had also not noticed that the "test" method in the IDL was a hyperlink to the description of the behavior.  I suppose it is going to take a while to learn a new set of conventions.

Darrel