On Jul 19, 2013, at 12:59, Junio C Hamano wrote:
"Kyle J. McKay" <mackyle@xxxxxxxxx> writes:
+#define URL_ALPHA "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
+#define URL_DIGIT "0123456789"
+#define URL_HEXDIGIT URL_DIGIT "ABCDEFabcdef"
+#define URL_ALPHADIGIT URL_ALPHA URL_DIGIT
+#define URL_SCHEME_CHARS URL_ALPHADIGIT "+.-"
+#define URL_HOST_CHARS URL_ALPHADIGIT ".-[:]" /* IPv6 literals need [:] */
+#define URL_UNSAFE_CHARS " <>\"%{}|\\^`" /* plus 0x00-0x1F, 0x7F-0xFF */
+#define URL_GEN_RESERVED ":/?#[]@"
+#define URL_SUB_RESERVED "!$&'()*+,;="
+#define URL_RESERVED URL_GEN_RESERVED URL_SUB_RESERVED /* only allowed delims */
+ ...
+	while (from_len) {
+		int ch = *from++;
+		int was_esc = 0;
+
+		from_len--;
+		if (ch == '%') {
+			if (from_len < 2 ||
+			    !strchr(URL_HEXDIGIT, from[0]) ||
+			    !strchr(URL_HEXDIGIT, from[1]))
I actually do like the readability of the approach in this patch,
but these repeated strchrs() in a loop may want to be optimized,
using a trick similar to what is used in ctype.c::sane_ctype[].
A small build-time-only program or script gen-http-ctype.perl that
defines and uses these URL_* cpp macros and generates a C source
file http-ctype-gen.c that can be #included from http.c, with
something like this in the Makefile:
http-ctype-gen.c: gen-http-ctype.perl
	rm -f $@ $@+
	$(PERL_PATH) gen-http-ctype.perl >$@+
	mv $@+ $@
http.o: http.c http-ctype-gen.c
would give us both readability and efficiency, perhaps?
Hmmm. That's a very fast technique. However, I would think something like:
#define IS_HEX_DIGIT(c) \
(('0'<=(c)&&(c)<='9')||('a'<=(c)&&(c)<='f')||('A'<=(c)&&(c)<='F'))
would be suitably fast without needing any added build files.
However, it looks like there is a ctype.h isxdigit() function, with a
version of it in git-compat-util.h, as well as a convenient
hexval_table to use for the conversion. I will alter the code to use
those instead, which will also do away with the hex_digit_value()
function.
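
A minimal sketch of what the isxdigit()-based check could look like,
assuming only the standard ctype.h isxdigit() rather than git's own
version. The names hex_value() and decode_percent_escape() are
hypothetical; hex_value() merely stands in for a table lookup such as
hexval_table. One side effect worth noting: strchr() treats the
terminating NUL as part of the string, so strchr(URL_HEXDIGIT, 0) is
non-NULL, whereas isxdigit() rejects a NUL byte.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a hex-value lookup table.
 * Returns 0..15; the caller must have checked isxdigit() first.
 */
static int hex_value(unsigned char c)
{
	assert(isxdigit(c));
	if (c >= '0' && c <= '9')
		return c - '0';
	/* ASCII only: setting bit 0x20 folds 'A'..'F' onto 'a'..'f' */
	return (c | 0x20) - 'a' + 10;
}

/*
 * Decode one "%XX" escape starting at s, with len bytes available.
 * Returns the decoded byte value (0..255), or -1 if the escape is
 * truncated or contains a non-hex character.
 */
static int decode_percent_escape(const char *s, size_t len)
{
	if (len < 3 || s[0] != '%' ||
	    !isxdigit((unsigned char)s[1]) ||
	    !isxdigit((unsigned char)s[2]))
		return -1;
	return (hex_value((unsigned char)s[1]) << 4) |
		hex_value((unsigned char)s[2]);
}
```

The casts to unsigned char matter for the standard isxdigit(), which
has undefined behavior for negative values other than EOF.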
If you mean replacing all of the strchr() calls, multiple tables would
be required since URL_SCHEME_CHARS and URL_HOST_CHARS partially
overlap, but it could be done. Is the speed of strchr() that much of a
concern? The code will only be invoked for http.<url>.* option
settings in any case, and I expect the user would have to set an
awfully large number of those to even begin to notice.
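
For comparison, a rough sketch of how overlapping classes could share
a single 256-entry table by giving each class its own flag bit, which
is, as far as I understand it, the idea behind sane_ctype[]. This
variant fills the table lazily at run time from the URL_* strings
instead of generating a C file at build time, so it is not the
gen-http-ctype.perl approach itself. The URL_F_* flags and the
url_ctype_* and url_is names are made up for illustration.

```c
#include <assert.h>

#define URL_ALPHA "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
#define URL_DIGIT "0123456789"
#define URL_ALPHADIGIT URL_ALPHA URL_DIGIT
#define URL_SCHEME_CHARS URL_ALPHADIGIT "+.-"
#define URL_HOST_CHARS URL_ALPHADIGIT ".-[:]" /* IPv6 literals need [:] */

/* Hypothetical flag names: one bit per character class */
enum { URL_F_SCHEME = 1 << 0, URL_F_HOST = 1 << 1 };

static unsigned char url_ctype[256];
static int url_ctype_ready;

/* Set flag for every character listed in chars */
static void url_ctype_mark(const char *chars, unsigned char flag)
{
	for (; *chars; chars++)
		url_ctype[(unsigned char)*chars] |= flag;
}

/* Fill the table once from the readable URL_* definitions */
static void url_ctype_init(void)
{
	if (url_ctype_ready)
		return;
	url_ctype_mark(URL_SCHEME_CHARS, URL_F_SCHEME);
	url_ctype_mark(URL_HOST_CHARS, URL_F_HOST);
	assert(url_ctype[0] == 0); /* NUL belongs to no class */
	url_ctype_ready = 1;
}

/* Non-zero if ch belongs to the class selected by flag */
static int url_is(int ch, unsigned char flag)
{
	url_ctype_init();
	return (url_ctype[(unsigned char)ch] & flag) != 0;
}
```

With this layout a test such as url_is(ch, URL_F_HOST) is one array
load and a mask, and a character like '.' that appears in both classes
simply carries both bits. Whether that is worth the extra code for a
handful of http.<url>.* settings is a separate question.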