On 4/9/07, Shawn O. Pearce <spearce@xxxxxxxxxxx> wrote:
> Nicolas Pitre <nico@xxxxxxx> wrote:
> > I'd be really tempted to create a pack v4 which only change is to still
> > have the pack header at the beginning of the pack like we do today, but
> > include the header in the pack SHA1 computation at the end of the stream
> > only.  This way the pack SHA1 could be computed as the pack is
> > generated, and the header fixed up without having to read the entire
> > pack back.  I think it was Geert Bosch who proposed this and it makes
> > tons of sense IMHO.
>
> Yes.  If we really are heading in this direction of needing to
> correct object counts, we should make that change.  Its trivial
> to hang onto that header for the duration of the rest of the data
> processing, and tack it onto the end for final SHA-1 computation.

I like the property that when an SHA-1 appears at the end of a file,
it is a checksum of every byte before it.  The ideas above are
a departure from that.  Do we want this rule to be different
for each file type?

Wouldn't the following address the "object count unknown at the
start of sequential pack writing" problem:

Write 0 for the object count in the header.  This is a flag telling
the reader to look for a second header, in the same format, just
before the final SHA-1; that second header carries the correct count.
The SHA-1 is still a checksum of everything before it, and no
seeking or rewriting is needed on generation.

When reading the object count from a .pack file, you might need to add

  xread(pack_fd, &header, sizeof(header));
+ if (!header.object_count) {
+     lseek(pack_fd, -(off_t)(20 + sizeof(header)), SEEK_END);
+     xread(pack_fd, &header, sizeof(header));
+ }

Or maybe you want this before the object_list_sha1 instead (20->40).

Finally, when I generate several 2GB split packfiles, I do notice
the slight delay for fixup_header_footer(), and I do think it's a
bit ugly, but in quantitative terms it's an insignificant part of
a long operation that's infrequently performed.  Does this need to
be optimized at all?

Thanks,
-- 
Dana L. How  danahow@xxxxxxxxx  +1 650 804 5991 cell
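
For anyone who wants to try the zero-count idea outside of git, here is a minimal standalone sketch.  It assumes the layout proposed above (leading header with a count of 0, a second header immediately before the 20-byte trailing SHA-1), which is only a proposal and not an existing pack format.  The names read_full(), read_object_count() and write_demo_pack() are hypothetical helpers made up for this illustration; read_full() merely stands in for git's xread(), and the demo writer stores 20 zero bytes where a real SHA-1 would go.

```c
#include <arpa/inet.h>   /* htonl, ntohl */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define SHA1_LEN 20

/* 12-byte pack header; all fields are stored in network byte order. */
struct pack_header {
	uint32_t signature;
	uint32_t version;
	uint32_t object_count;
};

/* Read exactly len bytes or fail (hypothetical stand-in for xread()). */
int read_full(int fd, void *buf, size_t len)
{
	char *p = buf;
	while (len > 0) {
		ssize_t n = read(fd, p, len);
		if (n <= 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Store the object count in *count; return 0 on success, -1 on error. */
int read_object_count(int fd, uint32_t *count)
{
	struct pack_header hdr;

	if (lseek(fd, 0, SEEK_SET) < 0 || read_full(fd, &hdr, sizeof(hdr)) < 0)
		return -1;
	if (hdr.object_count == 0) {
		/* Count was unknown at write time: use the header that
		 * sits just before the trailing SHA-1 (assumed layout). */
		off_t back = -(off_t)(SHA1_LEN + sizeof(hdr));
		if (lseek(fd, back, SEEK_END) < 0 ||
		    read_full(fd, &hdr, sizeof(hdr)) < 0)
			return -1;
	}
	*count = ntohl(hdr.object_count);
	return 0;
}

/* Write a toy file: header, 8 payload bytes, a trailing header when
 * head_count is 0, then 20 zero bytes as a placeholder for the SHA-1. */
int write_demo_pack(const char *path, uint32_t head_count, uint32_t tail_count)
{
	struct pack_header hdr = { htonl(0x5041434b), htonl(2), htonl(head_count) };
	unsigned char payload[8] = { 0 }, fake_sha1[SHA1_LEN] = { 0 };
	int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
	int ok;

	if (fd < 0)
		return -1;
	ok = write(fd, &hdr, sizeof(hdr)) == (ssize_t)sizeof(hdr) &&
	     write(fd, payload, sizeof(payload)) == (ssize_t)sizeof(payload);
	if (ok && head_count == 0) {
		hdr.object_count = htonl(tail_count);
		ok = write(fd, &hdr, sizeof(hdr)) == (ssize_t)sizeof(hdr);
	}
	ok = ok && write(fd, fake_sha1, SHA1_LEN) == SHA1_LEN;
	close(fd);
	return ok ? 0 : -1;
}
```

Calling write_demo_pack("demo.pack", 0, 7) and then read_object_count() on the resulting file should take the fallback path and report the count from the trailing header, while a file written with a nonzero leading count never seeks at all.  Note the explicit off_t cast on the seek offset: with a bare -20-sizeof(header) the arithmetic is done in an unsigned type, which only happens to work where size_t and off_t have the same width.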