On Thu, May 11, 2017 at 3:50 PM, Johannes Schindelin
<johannes.schindelin@xxxxxx> wrote:
> The real issue here is that GNU awk's regex implementation assumes a bit
> too much about the relative sizes of pointers and long integers. What they
> really want is to use intptr_t.
>
> This patch recapitulates what 56a1a3ab449 (Silence GCC's "cast of pointer
> to integer of a different size" warning, 2015-10-26) did to our previous
> copy of GNU awk's regex engine.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@xxxxxx>
> ---
> Published-As: https://github.com/dscho/git/releases/tag/compat-regex-fixes-v1
> Fetch-It-Via: git fetch https://github.com/dscho/git compat-regex-fixes-v1
>
>  .../0003-Use-intptr_t-instead-of-long.patch | 22 ++++++++++++++++++++++
>  compat/regex/regcomp.c                      |  4 ++--
>  2 files changed, 24 insertions(+), 2 deletions(-)
>  create mode 100644 compat/regex/patches/0003-Use-intptr_t-instead-of-long.patch
>
> diff --git a/compat/regex/patches/0003-Use-intptr_t-instead-of-long.patch b/compat/regex/patches/0003-Use-intptr_t-instead-of-long.patch
> new file mode 100644
> index 00000000000..246ff256fb8
> --- /dev/null
> +++ b/compat/regex/patches/0003-Use-intptr_t-instead-of-long.patch
> @@ -0,0 +1,22 @@
> +diff --git a/compat/regex/regcomp.c b/compat/regex/regcomp.c
> +index 5e9ea26cd46..e6469167a80 100644
> +--- a/compat/regex/regcomp.c
> ++++ b/compat/regex/regcomp.c
> +@@ -2641,7 +2641,7 @@ parse_dup_op (bin_tree_t *elem, re_string_t *regexp, re_dfa_t *dfa,
> +     old_tree = NULL;
> +
> +   if (elem->token.type == SUBEXP)
> +-    postorder (elem, mark_opt_subexp, (void *) (long) elem->token.opr.idx);
> ++    postorder (elem, mark_opt_subexp, (void *) (intptr_t) elem->token.opr.idx);
> +
> +   tree = create_tree (dfa, elem, NULL, (end == -1 ? OP_DUP_ASTERISK : OP_ALT));
> +   if (BE (tree == NULL, 0))
> +@@ -3868,7 +3868,7 @@ create_token_tree (re_dfa_t *dfa, bin_tree_t *left, bin_tree_t *right,
> + static reg_errcode_t
> + mark_opt_subexp (void *extra, bin_tree_t *node)
> + {
> +-  int idx = (int) (long) extra;
> ++  int idx = (int) (intptr_t) extra;
> +   if (node->token.type == SUBEXP && node->token.opr.idx == idx)
> +     node->token.opt_subexp = 1;
> +
> diff --git a/compat/regex/regcomp.c b/compat/regex/regcomp.c
> index 5e9ea26cd46..e6469167a80 100644
> --- a/compat/regex/regcomp.c
> +++ b/compat/regex/regcomp.c
> @@ -2641,7 +2641,7 @@ parse_dup_op (bin_tree_t *elem, re_string_t *regexp, re_dfa_t *dfa,
>      old_tree = NULL;
>
>    if (elem->token.type == SUBEXP)
> -    postorder (elem, mark_opt_subexp, (void *) (long) elem->token.opr.idx);
> +    postorder (elem, mark_opt_subexp, (void *) (intptr_t) elem->token.opr.idx);
>
>    tree = create_tree (dfa, elem, NULL, (end == -1 ? OP_DUP_ASTERISK : OP_ALT));
>    if (BE (tree == NULL, 0))
> @@ -3868,7 +3868,7 @@ create_token_tree (re_dfa_t *dfa, bin_tree_t *left, bin_tree_t *right,
>  static reg_errcode_t
>  mark_opt_subexp (void *extra, bin_tree_t *node)
>  {
> -  int idx = (int) (long) extra;
> +  int idx = (int) (intptr_t) extra;
>    if (node->token.type == SUBEXP && node->token.opr.idx == idx)
>      node->token.opt_subexp = 1;
>
>
> base-commit: 4e23cefb4da69a2d884c2d5a303825f40008ca42
> --
> 2.12.2.windows.2.800.gede8f145e06

Let's drop this current gawk import series. After talking to the gawk
author it turns out it's better to use the version from gnulib, which
includes the equivalent of your patch.

The following one-liner works for me on Linux to import that library,
on the master branch:

    $ git reset --hard; rm compat/regex/*.[ch]; rm -rfv /tmp/git.rx; test -e /tmp/git-gnulib || git clone https://git.savannah.gnu.org/git/gnulib.git /tmp/git-gnulib; mkdir /tmp/git.rx; touch /tmp/git.rx/configure.ac; /tmp/git-gnulib/gnulib-tool --lgpl --add-import --dir=/tmp/git.rx regex; cp /tmp/git.rx/lib/{intprops.h,reg*} compat/regex/; perl -0666 -pi.bak -e 's[compat/regex/regex.o: EXTRA_CPPFLAGS = \K[^\n]+\n[^\n]+][]s' Makefile; echo '#define _GNU_SOURCE' >compat/regex/config.h

I.e. remove the existing engine, import the new one from gnulib, then
wipe the extra -D flags that exist now, and define _GNU_SOURCE.