Jeff King <peff@xxxxxxxx> 于2021年5月31日周一 上午5:09写道: > > diff --git a/builtin/cat-file.c b/builtin/cat-file.c > > index 5ebf13359e83..5f9578f9b86b 100644 > > --- a/builtin/cat-file.c > > +++ b/builtin/cat-file.c > > @@ -345,11 +345,12 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d > > contents = read_object_file(oid, &type, &size); > > if (!contents) > > die("object %s disappeared", oid_to_hex(oid)); > > - if (type != data->type) > > - die("object %s changed type!?", oid_to_hex(oid)); > > - if (data->info.sizep && size != data->size) > > - die("object %s changed size!?", oid_to_hex(oid)); > > - > > + if (!(opt->all_objects && data->skip_object_info)) { > > + if (type != data->type) > > + die("object %s changed type!?", oid_to_hex(oid)); > > + if (data->info.sizep && size != data->size) > > + die("object %s changed size!?", oid_to_hex(oid)); > > + } > > Wouldn't checking data->skip_object_info be sufficient? It's only set > when opt->all_objects is set anyway. But more importantly, it is the > most direct root of the problem: we did not find out the type and size > earlier, so comparing anything against data->type is useless. > > But that leads me to a follow-up question: what if we did give a format, > so skip_object_info isn't set, but it didn't include the type or size? > Indeed, such a situation may arise with some format atom e.g. %(objectname) and %(rest). However %(deltabase) and %(objectdisk:size) don't have this problem because they filled typep in `packed_object_info()`. > In the size code, it looks like we explicitly protect against this by > checking if data->info.sizep is set (i.e., did we request the size from > oid_object_info_extended). But it's not for the type. > Yes, because we set typep while setting print_contents for print_object_or_die() to check different object type case, this leads to inconsistent behavior between sizep and typep. > So the assumption is that we do always fill in the type, even if the > user didn't ask for it. And that assumption is actually violated much > earlier. These two bits of code from the setup are out of order: > > if (opt->all_objects) { > struct object_info empty = OBJECT_INFO_INIT; > if (!memcmp(&data.info, &empty, sizeof(empty))) > data.skip_object_info = 1; > } > > /* > * If we are printing out the object, then always fill in the type, > * since we will want to decide whether or not to stream. > */ > if (opt->print_contents) > data.info.typep = &data.type; > > We should not let skip_object_info kick in at all if opt->print_contents > is requested. And that causes other bugs outside of the spot you found. > We look at data->type earlier in print_object_or_die() to decide whether > or not to stream the contents, but we'll see garbage (fortunately we > zero-initialize the expand_data struct, so it's at least predictably > zero, and not random undefined behavior). > > But I think we'd want to solve it by swapping the two conditionals I > showed above, which restores the assumption made in print_object_or_die(). > This method is correct. This will ensure that skip_object_info will not be set when print_contents is set. By the way, maybe we can merge this two "if (opt->all_objects)" block to one. if (opt->all_objects) { struct object_info empty = OBJECT_INFO_INIT; if (!memcmp(&data.info, &empty, sizeof(empty))) data.skip_object_info = 1; } if (opt->all_objects) { struct object_cb_data cb; if (has_promisor_remote()) warning("This repository uses promisor remotes. Some objects may not be loaded."); After the merger: if (opt->all_objects) { struct object_cb_data cb; struct object_info empty = OBJECT_INFO_INIT; if (!memcmp(&data.info, &empty, sizeof(empty))) data.skip_object_info = 1; if (has_promisor_remote()) warning("This repository uses promisor remotes. Some objects may not be loaded."); > > +test_expect_success 'cat-file --batch="batman" with --batch-all-objects will work' ' > > + git -C all-two cat-file --batch-all-objects --batch="%(objectname)" | wc -l >expect && > > + git -C all-two cat-file --batch-all-objects --batch="batman" | wc -l >actual && > > + test_cmp expect actual > > +' > > Is it worth testing both of these? The %(objectname) one will fail in > the same way (because we do not need to run oid_object_info() to get the > oid, which we already have). I'm OK doing both for better coverage, but > it may be worth mentioning either in a comment or in the commit message > that we expect both to fail, and why. > Yes, these damages need to be pointed out in the commit message. > -Peff Thanks. -- ZheNing Hu