On Mon, Feb 01, 2010 at 11:48:29AM +0100, Ellié Computing Open Source Program wrote:

> C:\temp\scc-tests\git>git ls-tree -r HEAD "caractère spécial"
> 100644 blob bf10a8b39e72c754ee1872fcdb13662cba6a8880 "caract\350re
> sp\351cial/\272plouf.txt"
>
> Note the spurious \272 which comes in the listing :(
> Trying again the same commands may give other spurious characters
> (each time we tried we get different _bad_ responses)

Looks like a bug. I was able to replicate it here, and valgrind notices
it, too:

  ==22720== Invalid read of size 1
  ==22720==    at 0x80E77FF: next_quote_pos (quote.c:174)
  ==22720==    by 0x80E783A: quote_c_style_counted (quote.c:215)
  ==22720==    by 0x80E7D14: write_name_quotedpfx (quote.c:286)
  ==22720==    by 0x80808F3: show_tree (builtin-ls-tree.c:114)
  ==22720==    by 0x811000D: read_tree_recursive (tree.c:114)
  ==22720==    by 0x81100E7: read_tree_recursive (tree.c:131)
  ==22720==    by 0x8080CC2: cmd_ls_tree (builtin-ls-tree.c:173)
  ==22720==    by 0x804B7FA: run_builtin (git.c:257)
  ==22720==    by 0x804B958: handle_internal_command (git.c:412)
  ==22720==    by 0x804BA2F: run_argv (git.c:454)
  ==22720==    by 0x804BB97: main (git.c:525)
  ==22720==  Address 0x43405b4 is 0 bytes after a block of size 20 alloc'd
  ==22720==    at 0x4024C4C: malloc (vg_replace_malloc.c:195)
  ==22720==    by 0x8115739: xmalloc (wrapper.c:20)
  ==22720==    by 0x811005E: read_tree_recursive (tree.c:127)
  ==22720==    by 0x8080CC2: cmd_ls_tree (builtin-ls-tree.c:173)
  ==22720==    by 0x804B7FA: run_builtin (git.c:257)
  ==22720==    by 0x804B958: handle_internal_command (git.c:412)
  ==22720==    by 0x804BA2F: run_argv (git.c:454)
  ==22720==    by 0x804BB97: main (git.c:525)

The patch below fixes it for me. This is the first time I've ever looked
at this code, though, so an extra set of eyes is appreciated.

I'm also not sure of the "!p[len]" termination that the loop uses
(quoted in the context below). The string is explicitly not
NUL-terminated, so why would that matter? I think that may have been
covering up the bug in some cases.
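
For anyone who wants to poke at the failure mode outside of git, here is
a minimal standalone sketch of the loop. It is not the code in quote.c:
next_special() and quote_counted() are made-up names, output goes to a
plain char buffer instead of EMIT/EMITBUF, every quoted byte is written
as octal rather than going through sq_lookup, and only the
length-bounded case is modeled (there is no "!p[len]" check). The
decrement_maxlen flag is also hypothetical; it just toggles the
one-line fix.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for next_quote_pos(): offset of the first byte
 * in s[0..maxlen) that needs quoting, or maxlen if none does.
 */
static size_t next_special(const char *s, size_t maxlen)
{
	size_t len = 0;

	while (len < maxlen) {
		unsigned char ch = (unsigned char)s[len];
		if (ch < 0x20 || ch >= 0x7f || ch == '"' || ch == '\\')
			break;
		len++;
	}
	return len;
}

/*
 * Simplified model of the quoting loop. Quotes the first maxlen bytes
 * of name into out (which the caller must make large enough) and
 * returns the number of bytes written, not counting the trailing NUL.
 * Passing decrement_maxlen == 0 reproduces the bug: the remaining
 * length is never reduced, so the scan runs past name + maxlen.
 */
static size_t quote_counted(const char *name, size_t maxlen, char *out,
			    int decrement_maxlen)
{
	const char *p = name;
	size_t n = 0, len, i;

	for (;;) {
		unsigned char ch;

		len = next_special(p, maxlen);
		if (len == maxlen)
			break;

		/* literal run, then the quoted byte as three octal digits */
		for (i = 0; i < len; i++)
			out[n++] = p[i];
		out[n++] = '\\';
		p += len;
		ch = (unsigned char)*p++;
		if (decrement_maxlen)
			maxlen -= len + 1;	/* the one-line fix */
		out[n++] = ((ch >> 6) & 03) + '0';
		out[n++] = ((ch >> 3) & 07) + '0';
		out[n++] = ((ch >> 0) & 07) + '0';
	}

	/* trailing literal run */
	for (i = 0; i < len; i++)
		out[n++] = p[i];
	out[n] = '\0';
	return n;
}
```

Given the buffer "a\351b\272plouf" with a length of 3, the fixed variant
(decrement_maxlen = 1) writes a\351b, while the buggy variant
(decrement_maxlen = 0) writes a\351b\272plo: the bytes stored after the
counted string leak into the output, much like the spurious \272 in the
report. In this sketch the extra bytes are still inside the same array,
so the demonstration itself never reads out of bounds.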

-- >8 --
Subject: [PATCH] fix invalid read in quote_c_style_counted

We progress through a length-bounded string, looking for characters in
need of quoting. After each character is found, we output everything up
until that character literally, then the quoted character. We then
advance our string and look again. However, we never actually
decremented the length, meaning we ended up looking at whatever random
junk was stored after the string.

Signed-off-by: Jeff King <peff@xxxxxxxx>
---
 quote.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/quote.c b/quote.c
index acb6bf9..392006d 100644
--- a/quote.c
+++ b/quote.c
@@ -216,20 +216,21 @@ static size_t quote_c_style_counted(const char *name, ssize_t maxlen,
 		if (len == maxlen || !p[len])
 			break;
 
 		if (!no_dq && p == name)
 			EMIT('"');
 
 		EMITBUF(p, len);
 		EMIT('\\');
 		p += len;
 		ch = (unsigned char)*p++;
+		maxlen -= len + 1;
 		if (sq_lookup[ch] >= ' ') {
 			EMIT(sq_lookup[ch]);
 		} else {
 			EMIT(((ch >> 6) & 03) + '0');
 			EMIT(((ch >> 3) & 07) + '0');
 			EMIT(((ch >> 0) & 07) + '0');
 		}
 	}
 
 	EMITBUF(p, len);
-- 
1.7.0.rc1.16.g21332.dirty