See: https://github.com/httpwg/http-extensions/pull/3016

> On 28 Feb 2025, at 11:46 am, Mark Nottingham <mnot@xxxxxxxx> wrote:
>
> Roy,
>
> Just to clarify:
>
>>> There is no
>>> benefit for legit clients to send more identifiers than are actually in use.
>>> Servers can choose to respond with a 431 if the number/size received
>>> seems unreasonable.
>
> The servers set the header, not clients, so there isn't an opportunity to refuse if it's too big.
>
> That said, I'd support lowering the requirements to the levels you suggest. Anyone see a reason not to?
>
> Re: Cache-Group-Invalidation -- the use case we had in mind is extending invalidation to related resources when clients make a state-changing request. Extending that to effectively do piggyback cache invalidation (remember that? Hey Bala :) is a pretty big change. I'm not against it, but would want to think through the implications first.
>
> Cheers,
>
>
>> On 27 Feb 2025, at 6:05 am, Roy T. Fielding <fielding@xxxxxxxx> wrote:
>>
>>> On Feb 12, 2025, at 7:35 AM, The IESG <iesg-secretary@xxxxxxxx> wrote:
>>>
>>>
>>> The IESG has received a request from the HTTP WG (httpbis) to consider the
>>> following document: - 'HTTP Cache Groups'
>>> <draft-ietf-httpbis-cache-groups-03.txt> as Proposed Standard
>>
>> While doing a header field review I ran across the below requirements in
>> sections 2 (Cache Groups) and 3 (Cache-Group-Invalidation).
>> Implementations MUST support at least 128 groups in a field value,
>> with up to at least 128 characters in each member. Note that generic
>> limitations on HTTP field lengths may constrain the size of this
>> field value in practice.
>>
>> Why would we want to require implementations to support that many?
>> What is the protocol interoperability problem that this is trying to solve?
>>
>> I mean, literally, require that an implementation receive and process an
>> HTTP header field with a 26 character field name, colon, space, 256
>> double quotes, 127 comma separators, and 16,384 identifying characters?
>> For cache efficiency?
>>
>> The original motivation seems to be documented at
>>
>> https://github.com/httpwg/http-extensions/issues/2701
>>
>> as the average of the documentation claims within existing similar
>> implementations that use internal APIs and response bodies to
>> communicate the group names. Except the issue also fails to count
>> the syntax delimiters, so the requirement above actually exceeds
>> all current implementations.
>>
>> To be clear, it's fine if someone does implement that much support.
>> But why should the protocol require it as a minimum MUST?
>> Why not let systems implement the protocol while only allowing
>> 24 groups per message, or identifiers with no more than 64 characters?
>>
>> If a given CDN only allows a dozen identifiers per cache entry, why
>> would a client need to parse more in a received response?
>> If a client sends more than a sensible number of cache groups in a
>> single message, why should the server be required to process them?
>> Likewise, if a server receives a list of internally invalid group names
>> (names aren't opaque to the origin) does it actually have to process
>> them all or can it simply drop the client?
>>
>> IOW, what is the interop purpose of that requirement, and when
>> is it applicable to a given HTTP message?
>>
>> For our CMS (Adobe Experience Manager with Edge Delivery Services),
>> we have implemented this style of cache invalidation on several different
>> CDNs (i.e., each with their own proprietary APIs: using surrogate keys
>> on Fastly, cache tags on Cloudflare, etc.). It has been a very effective
>> strategy for deploying low-latency sites with push invalidation.
>>
>> I am pretty sure that we have never needed more than six
>> concurrent keys/tags for any given resource. Four is the average.
>> Cache groups are a scoping mechanism that is typically based on
>> how content is generated or when it was released, so they tend to
>> be values for "whole-site", "release-tag", "back-end", and
>> "this page", with only the latter being resource-specific.
>>
>> This is a controlled namespace, usually allocated by the origin
>> and operated upon only by clients that have authenticated with that
>> same origin. They don't need to support more identifiers than
>> what they have already defined (and not yet invalidated). There is no
>> benefit for legit clients to send more identifiers than are actually in use.
>> Servers can choose to respond with a 431 if the number/size received
>> seems unreasonable.
>>
>> "Implementations" that want to mint an exceptionally large number of
>> groups with very long names can allocate their own resources to do so.
>>
>> Part of the problem here is that the requirement above assumes that
>> "Implementations" means "a CDN"; more specifically, a CDN of global
>> scale that expects to interop with any application built for existing
>> globally-scaled CDNs.
>>
>> But such a requirement simply doesn't make sense for this protocol.
>> We don't need all CDNs to be as large and capable as the very best.
>> We only need them to be large enough for our application's needs.
>>
>> In any case, it should be clear that "Implementation" normally means
>> a client or a server. If this is supposed to be a requirement only on CDNs
>> (as a whole or on specific types), then it should say that.
>>
>> The protocol itself doesn't need a specific number. It only needs to
>> define the syntax and explain how to respond when there are too
>> many identifiers or an identifier that is too long. Let the market for
>> this feature figure out what the minimums should be.
>>
>> In any case, somewhere in the spec should be a very loud statement
>> that long identifiers increase the cost and latency between the origin
>> and downstream caches (which typically strip these header fields
>> before delivery to a user agent).
>>
>> Do CDNs need a common minimum for implementation? Maybe.
>> I would make that a minimum of 32-character names and 16 groups
>> per header field list. It may not sound like much, but that's more than
>> enough to make the protocol useful. I'd recommend support for up
>> to 64-char names (for 512-bit hex hashes), but have
>> seen no need to implement that in practice. Using long names to
>> implement a low-latency system is a spectacular foot-gun.
>>
>> = = = =
>>
>> As a separate issue, Section 3 (Cache-Group-Invalidation) has the requirement
>>
>> The Cache-Group-Invalidation header field MUST be ignored on
>> responses to requests that have a safe method (e.g., GET; see
>> Section 9.2.1 of [HTTP]).
>>
>>
>> and doesn't explain why. My guess is that this requirement is
>> misunderstanding the limitation on side-effects of safe methods.
>>
>> Invalidating a previously cached response is not a side-effect of
>> the request; it's a statement by the origin. It isn't the method that
>> has the side-effect.
>>
>> There might be other reasons to ignore it in responses to GET, but
>> I don't know what they would be and the spec doesn't help.
>>
>> The reason why this is important is because the vast majority of
>> communication between an origin and caches/CDNs currently
>> takes place within cache-control (or CDN-specific fields) in
>> 2xx/304 responses to GET requests.
>>
>> Having this mechanism be arbitrarily limited to POST or DELETE
>> responses is effectively requiring something like an administrative
>> form or API, in which case there should be a response body defining
>> the list of what groups to invalidate, in a format defined by this spec,
>> and we wouldn't need the Cache-Group-Invalidation header field.
>>
>> In practice, what I would want (and expect) is that the header field
>> can be present in GET responses as an advisory note. IOW, tell
>> the downstream that these identified groups of content can be
>> marked as stale and thus can be evicted from resource-constrained
>> caches (prior to evicting fresh content).
>>
>> Furthermore, if a recipient can trust the origin (like a CDN can
>> trust its contract), then the recipient can be configured to invalidate
>> the identified cache groups upon seeing any secured response with
>> that field from the applicable origin, regardless of request method.
>>
>> Cheers,
>>
>> Roy T. Fielding
>> Senior Principal Scientist, Adobe
>>
>
> --
> Mark Nottingham https://www.mnot.net/
>

--
Mark Nottingham https://www.mnot.net/
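
For reference, the size arithmetic in Roy's message can be checked with a short sketch. It counts only the field value (quoted identifiers plus separators), not the field name, colon, or space, and it assumes a single comma between members, as the message counts them. The function names are hypothetical and are not taken from the draft or from any implementation.

```python
def field_value_size(groups: int, chars: int, separator: str = ",") -> int:
    """Length of a list of `groups` quoted strings, each `chars` characters long."""
    if groups <= 0:
        return 0
    quoted_member = chars + 2  # identifier plus its two double quotes
    return groups * quoted_member + (groups - 1) * len(separator)


def build_value(groups: int, chars: int, separator: str = ",") -> str:
    """Build such a value literally, as a cross-check on the arithmetic."""
    return separator.join('"' + "g" * chars + '"' for _ in range(groups))


if __name__ == "__main__":
    cases = [
        ("draft MUST (128 groups x 128 chars)", 128, 128),
        ("alternative floated in the message (24 x 64)", 24, 64),
        ("suggested CDN minimum (16 x 32)", 16, 32),
    ]
    for label, groups, chars in cases:
        size = field_value_size(groups, chars)
        # The closed-form count must match the literally constructed value.
        assert size == len(build_value(groups, chars))
        print(f"{label}: {size} characters in the field value")
```

Under these assumptions the draft's minimum works out to 16,384 identifier characters, 256 double quotes, and 127 commas, matching the figures in the message; a serializer that emits a comma followed by a space would produce a slightly larger value.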