On Thu, Jan 05, 2012 at 01:55:22PM -0800, Junio C Hamano wrote: > Jeff King <peff@xxxxxxxx> writes: > > > The worst potential problem I could come up with is if you somehow had > > an object whose "parsed" flag was set, but somehow didn't have its other > > fields set (like type). > > ... > > So I think it is safe short of somebody doing some horrible manual > > munging of a "struct object". > > Yeah, I was worried about codepaths like commit-pretty-printing might be > mucking with the contents of commit->buffer, perhaps reencoding the text > and then calling parse_object() to get the unmodified original back, or > something silly like that. But the lookup_object() call at the beginning > of the parse_object() already prevents us from doing such a thing, so we > should be OK, I would think. Er, without my patch there is no such lookup_object, is there? What saves you is that the parse_*_buffer functions all do nothing when the object.parsed flag is set, and the code I added makes sure that object.parsed is set in the object that lookup_object returns. So yeah, anytime you tweak the contents of commit->buffer but don't unset the "parsed" flag, you are asking for trouble. Here's another possible code path where the behavior is changed: 1. You set the global save_commit_buffer to 0. 2. You call parse_commit(commit) on an unparsed commit object, which does not save the commit buffer, but does set commit->object.parsed. 3. You call parse_object(commit->object.sha1). a. Without my patch, we read the file contents again, do _not_ re-parse them (because we look up the existing object and notice that its "parsed" flag is set), but we _do_ assign the buffer to commit->buffer. b. With my patch, we see that there is an existing object that is already parsed, and return early. commit->buffer remains NULL. I would argue that this doesn't matter, since "parse_commit" uses the exact same optimization (it returns early without setting commit->buffer if the parsed flag is set). So any program turning off save_commit_buffer has to be ready to deal with a NULL commit->buffer in the first place. The only exception would be a program that then tries to fill in the commit->buffer field by manually running parse_object on an already-parsed, buffer-less commit object. I don't think we do that. You can verify that commit->buffer is the only place where these issues can happen by following the logic in parse_object_buffer. Sorry to belabor the discussion, but this is such a core piece of code, I want to make sure the optimization isn't hurting anybody (I don't think it is, and certainly the tests are all happy, but I think talking through the cases is a good thing). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html