On Mon, Sep 26, 2016 at 03:32:44PM -0400, David Turner wrote: > From: Jeff King <peff@xxxxxxxx> > > When the tree-walker runs into an error, it just calls > die(), and the message is always "corrupt tree file". > However, we are actually covering several cases here; let's > give the user a hint about what happened. > > Let's also avoid using the word "corrupt", which makes it > seem like the data bit-rotted on disk. Our sha1 check would > already have found that. These errors are ones of data that > is malformed in the first place. > > Signed-off-by: David Turner <dturner@xxxxxxxxxxxx> > Signed-off-by: Jeff King <peff@xxxxxxxx> Yay. This has been on my "to look at and repost" list for literally 2 years. Thanks for picking it up (see kids, procrastination _does_ pay off). > t/t1007-hash-object.sh | 15 +++++++++++++-- > t/t1007/tree-with-empty-filename | Bin 0 -> 28 bytes > t/t1007/tree-with-malformed-mode | Bin 0 -> 39 bytes Ooh, and tests. Exciting. > -test_expect_success 'corrupt tree' ' > +test_expect_success 'too-short tree' ' > echo abc >malformed-tree && > - test_must_fail git hash-object -t tree malformed-tree > + test_must_fail git hash-object -t tree malformed-tree 2>err && > + grep "too-short tree object" err > +' Should this be test_i18ngrep? Even if the message is not translated now, it seems like a good proactive measure (and probably it _should_ be translated). > +test_expect_success 'malformed mode in tree' ' > + test_must_fail git hash-object -t tree ../t1007/tree-with-malformed-mode 2>err && > + grep "malformed mode in tree entry for tree" err > +' This ".." will break when the test is run with "--root". You should use "$TEST_DIRECTORY"/t1007/... instead. And ditto in the second test, of course. > diff --git a/t/t1007/tree-with-empty-filename b/t/t1007/tree-with-empty-filename > new file mode 100644 > index 0000000000000000000000000000000000000000..aeb1ceb20e485eebd0acbb81c974d1c6fedcc1fe > GIT binary patch > literal 28 > kcmXpsFfcPQQDAsB_tET47q2;ccWbUIkGgT_Nl)-Z0Hx{;SO5S3 > > literal 0 > HcmV?d00001 This is rather opaque, of course. :) I wonder if it would be possible to generate the test vector with something like: # any 20 bytes will do bin_sha1='\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0' printf "100644 \0$bin_sha1" >tree-with-empty-filename I know that is longer and possibly more error-prone to run, but I think it makes the test much easier to read and modify later. I also wonder if $bin_sha1 should actually be more like: hex_sha1=$(echo foo | git hash-object --stdin -w) bin_sha1=$(echo $hex_sha1 | perl -ne 'printf "\\%3o", ord for /./g') so that it's a real sha1 (or maybe it is in your original, from an object that happens to be in the repo; it's hard to tell). I wouldn't expect the code to actually get to the point of looking at the sha1, but it's perhaps a more realistic test. I also think it would be nice if hash-object had a "--binary-sha1" option to avoid the perl grossness. :) > diff --git a/tree-walk.c b/tree-walk.c > index ce27842..ba544cf 100644 The code change itself looks brilliant, naturally. :) -Peff