On Saturday 09 June 2007, Johannes Schindelin wrote: > Hi, Hi. Thanks for taking the time to look at (some of) my patch. Most of your questions below can be answered with a single answer: The main purpose of the patch is (as the subject line says) to bring the two functions more in line with eachother. At the time I made the patch, I had made the observation that these function were trying to do much the same thing, albeit in a slightly different form. This patch is therefore about applying a series of (mostly non-functional) refactorings to make their diff as small as possible. This involves "stupid" changes such as renaming variables, tweaking whitespace, reordering the declaration of variables, etc. It's all to make the functions similar to the point where I can diff them, get a small and meaningful result, see the remaining _real_ differences, and in the end, _merge_ them (see patches 7-9). If this whole exercise didn't end up with merging the two functions into one, I would _totally_ agree with you that all this refactoring is more harmful than beneficial. > On Sat, 9 Jun 2007, Johan Herland wrote: > > if (size < 64) > > return error("wanna fool me ? you obviously got the size wrong !"); > > > > - buffer[size] = 0; > > Are you sure that your buffer is always NUL terminated? First, (and you'll see this in the commit message) I'm _moving_ (not removing) the NUL termination out of verify_tag() and into main() (which I can be sure is the only caller of verify_tag(), since verify_tag is declared static, and there is no other call in that file). Two reasons for doing this: 1. Make verify_tag more similar to parse_tag_buffer() (because parse_tag_buffer() does not NUL terminate) 2. Do the NUL termination as close to the code that actually populated the buffer with data (the read_pipe() in main()) So now you can ask: Why doesn't parse_tag_buffer() NUL terminate its input? It _that_ safe? And I ran around checking all the callers of parse_tag_buffer, and found that all of them use data (most of which originates from read_sha1_file()) that's already NUL terminated. In the end, I also put in a comment on the resulting function (parse_and_verify_tag_buffer()), explicitly saying the given data _must_ be NUL terminated. Side note: At first I actually thought the manual NUL termination could cause a buffer overflow (i.e. if the given size was the same as the allocated size), so I actually have a version of all of this where I _don't_ assume the buffer is NUL-terminated at all, and put in lots of bounds checking, replace strchr() with memchr(), etc. I then took a hard look at read_pipe(), and discovered that if you use it to fill a 4096-byte buffer with 4096 bytes of data, it actually _will_ reallocate to 8192 bytes and leave room for the NUL termination (and much more) (I believe this should have been documented in read_pipe). Thus the NUL termination was safe all along. > > - type_line = object + 48; > > + type_line = data + 48; > > Quite a lot of changes seem to do this object->data. The patch would have > been much more compact if you just had renamed buffer to object instead of > data. Yes, but it would have made the aforementioned diff to parse_tag_buffer() larger. > > /* TODO: check for committer info + blank line? */ > > /* Also, the minimum length is probably + "tagger .", or 63+8=71 */ > > > > /* The actual stuff afterwards we don't care about.. */ > > return 0; > > -} > > > > #undef PD_FMT > > +} > > Any particular reason for this? Well, PD_FMT is only used inside the function, so I found it easier to move the #definition of PD_FMT inside the function to indicate the scope (_perceived_ scope; I know it hasn't any effect on the compiled code). But since the whole function is going away in a few patches anyway, I should have probably left it out of this patch entirely. > > @@ -124,6 +120,7 @@ int main(int argc, char **argv) > > free(buffer); > > die("could not read from stdin"); > > } > > + buffer[size] = 0; > > Ah, so you terminate the buffer here. From the patch, it is relatively > hard to see if this line is always hit _before_ the function is called > that evidently relies on NUL termination. By moving it here, I think it is > much more likely to overlook the fact that the function, which made sure > that its assumption was met, needs this line. Whereas if you left it where > it was, the assumption would always be met. But if I leave the NUL termination within the function I would have to backtrack out of the function to all of its potential callers and check whether it's safe to write to index size. Since the word "size" could easily mean "allocated size" I would have the initial feeling that this might be a buffer overflow, i.e. _not_ safe. In the end, I think the best solution is to make sure NUL termination happens before calling the function, and then documenting explicitly that the function assumes NUL terminated input. Which is exactly what I end up with at the end of the patch series. > > - sig_line++; > > + tagger_line++; > > I am really reluctant with renamings like these. IMHO they don't buy you > much, except for possible confusion. It is evident that sig means the > signer (and it is obvious in the case of an unsigned tag, who is meant, > too). Hmm. The "type" line is found in the variable type_line, the "tag" line is found in the variable tag_line, and the "tagger" line is found in the variable ... sig_line? Nope, I don't buy it. ...Johan -- Johan Herland, <johan@xxxxxxxxxxx> www.herland.net - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html