Re: [PATCH v2 4/8] http: extract type/subtype portion of content-type

On May 22, 2014, at 02:29, Jeff King wrote:

When we get a content-type from curl, we get the whole
header line, including any parameters, and without any
normalization (like downcasing or whitespace) applied.
If we later try to match it with strcmp() or even
strcasecmp(), we may get false negatives.

This could cause two visible behaviors:

 1. We might fail to recognize a smart-http server by its
    content-type.

 2. We might fail to relay text/plain error messages to
    users (especially if they contain a charset parameter).

This patch teaches the http code to extract and normalize
just the type/subtype portion of the string. This is
technically passing out less information to the callers, who
can no longer see the parameters. But none of the current
callers cares, and a future patch will add back an
easier-to-use method for accessing those parameters.

Signed-off-by: Jeff King <peff@xxxxxxxx>
---
http.c                     | 32 +++++++++++++++++++++++++++++---
remote-curl.c              |  2 +-
t/lib-httpd/error.sh       |  8 +++++++-
t/t5550-http-fetch-dumb.sh |  5 +++++
4 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/http.c b/http.c
index 94e1afd..4edf5b9 100644
--- a/http.c
+++ b/http.c
@@ -906,6 +906,29 @@ static CURLcode curlinfo_strbuf(CURL *curl, CURLINFO info, struct strbuf *buf)
	return ret;
}

+/*
+ * Extract a normalized version of the content type, with any
+ * spaces suppressed, all letters lowercased, and no trailing ";"
+ * or parameters.
+ *
+ * Example:
+ *   "TEXT/PLAIN; charset=utf-8" -> "text/plain"
+ */
+static void extract_content_type(struct strbuf *raw, struct strbuf *type)
+{
+	const char *p;
+
+	strbuf_reset(type);
+	strbuf_grow(type, raw->len);
+	for (p = raw->buf; *p; p++) {
+		if (isspace(*p))
+			continue;
+		if (*p == ';')
+			break;
+		strbuf_addch(type, tolower(*p));
+	}
+}
+

This will parse some invalid content types as valid, since whitespace is
dropped wherever it appears in the string. That is probably not important,
because the producer of an invalid content type shouldn't depend on any
particular behavior from the consumer of such a type. Still, I think it
warrants a note in the comment block, perhaps something like:

  * Note that an invalid content-type may be converted to a valid one

or some such.

--Kyle