On 2023-01-26 02:31, Jeff King wrote: > On Fri, Jan 20, 2023 at 10:08:49PM +0000, Matthew John Cheetham via GitGitGadget wrote: > >> From: Matthew John Cheetham <mjcheetham@xxxxxxxxxxx> >> >> Read and store the HTTP WWW-Authenticate response headers made for >> a particular request. >> >> This will allow us to pass important authentication challenge >> information to credential helpers or others that would otherwise have >> been lost. > > Makes sense, and the code looks pretty reasonable overall. > > A few observations: > >> @@ -115,6 +116,19 @@ struct credential { >> */ >> struct string_list helpers; >> >> + /** >> + * A `strvec` of WWW-Authenticate header values. Each string >> + * is the value of a WWW-Authenticate header in an HTTP response, >> + * in the order they were received in the response. >> + */ >> + struct strvec wwwauth_headers; >> + >> + /** >> + * Internal use only. Used to keep track of split header fields >> + * in order to fold multiple lines into one value. >> + */ >> + unsigned header_is_last_match:1; >> + > > Stuffing this into a "struct credential" feels a little weird, just > because it's specific to http parsing (especially this internal flag). > And the credential code is seeing full header lines, not broken down at > all. > > I guess I would have expected some level of abstraction here between the > credential subsystem and the http subsystem, where the latter is parsing > and then sticking opaque data into the credential to ferry to the > helpers. > > But it probably isn't that big a deal either way. Even though there are > non-http credentials, it's not too unreasonable for the credential > system to be aware of http specifically. I had considered possibly introducing an opaque property-bag style of 'protocol-specific properties' that, for example, http.c would add the WWW-Authenticate headers to as something like `http.wwwauth[]`. Other protocols (like smtp:// or cert://) could add their own properties if they needed or wanted to also. Thoughts? >> +static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p) >> +{ >> + size_t size = st_mult(eltsize, nmemb); > > Here's that st_mult() again. Same comment as the previous patch. :) Yeah I'm gonna drop this. Your arguments make sense; it's not going to be a problem in reality :-) >> + /* >> + * Header lines may not come NULL-terminated from libcurl so we must >> + * limit all scans to the maximum length of the header line, or leverage >> + * strbufs for all operations. >> + * >> + * In addition, it is possible that header values can be split over >> + * multiple lines as per RFC 2616 (even though this has since been >> + * deprecated in RFC 7230). A continuation header field value is >> + * identified as starting with a space or horizontal tab. >> + * >> + * The formal definition of a header field as given in RFC 2616 is: >> + * >> + * message-header = field-name ":" [ field-value ] >> + * field-name = token >> + * field-value = *( field-content | LWS ) >> + * field-content = <the OCTETs making up the field-value >> + * and consisting of either *TEXT or combinations >> + * of token, separators, and quoted-string> >> + */ >> + >> + strbuf_add(&buf, ptr, size); > > OK, so we just copy the buffer. I don't think it would be too hard to > handle the buffer as-is, but this does make things a bit easier. Given > that we're going to immediately throw away the copy for anything except > www-authenticate, we could perhaps wait until we've matched it. That > does mean trimming the CRLF ourselves and using skip_prefix_mem() to > match the start (you'd want skip_iprefix_mem(), of course, but it > doesn't yet exist; I'll leave that as an exercise). Fair point! I can replace most of these with operations over the curl ptr. > Maybe not worth it to save a few allocations, as an http request is > already pretty heavyweight. Mostly I flagged it because this is going to > run for every header of every request, even though most requests won't > trigger it at all. > >> + /* Strip the CRLF that should be present at the end of each field */ >> + strbuf_trim_trailing_newline(&buf); >> + >> + /* Start of a new WWW-Authenticate header */ >> + if (skip_iprefix(buf.buf, "www-authenticate:", &val)) { >> + while (isspace(*val)) >> + val++; >> + >> + strvec_push(values, val); >> + http_auth.header_is_last_match = 1; >> + goto exit; >> + } > > OK, this looks correct from my knowledge of the RFCs. I saw something > about isspace() matching newlines, etc, in an earlier thread, but I > think we'd never see a newline here, as we're expecting curl to be > splitting on our behalf. > >> + /* >> + * This line could be a continuation of the previously matched header >> + * field. If this is the case then we should append this value to the >> + * end of the previously consumed value. >> + * Continuation lines start with at least one whitespace, maybe more, >> + * so we should collapse these down to a single SP (valid per the spec). >> + */ >> + if (http_auth.header_is_last_match && isspace(*buf.buf)) { >> + /* Trim leading whitespace from this continuation hdr line. */ >> + strbuf_ltrim(&buf); > > OK, makes sense. This will memmove(), which is needlessly inefficient > (we could just advance a pointer), but probably not a big deal in > practice. Using the strbuf functions is a nice simplification. > >> + /* >> + * At this point we should always have at least one existing >> + * value, even if it is empty. Do not bother appending the new >> + * value if this continuation header is itself empty. >> + */ >> + if (!values->nr) { >> + BUG("should have at least one existing header value"); >> + } else if (buf.len) { >> + char *prev = xstrdup(values->v[values->nr - 1]); >> + >> + /* Join two non-empty values with a single space. */ >> + const char *const sp = *prev ? " " : ""; >> + >> + strvec_pop(values); >> + strvec_pushf(values, "%s%s%s", prev, sp, buf.buf); >> + free(prev); >> + } > > Likewise here we end up with an extra allocation of "prev", just because > we can't pop/push in the right order. But that's probably OK in > practice, as this is triggering only for the header we care about. > > The concatenation itself makes the whole thing quadratic, but unless we > are worried about a malicious server DoS-ing us with a billion > www-authenticate continuations, I think we can disregard that. > > -Peff