On Monday 23. May 2011, Shawn Pearce wrote:
> We can still get a tighter estimate if we wanted to. I wouldn't mix
> it into this patch, but make a new one on top of it. During delta
> compression we hold onto deltas, or at least compute and retain the
> size of the chosen delta. We could re-check the pack size after the
> Compressing phase by including the delta sizes in the estimate, and
> if we are over, abort before writing.

Ok. Not sure when I'll have the time/courage to dive into this, but I'll
at least give it a try.

> For non-delta, non-reuse we may be able to guess by just using the
> loose object size. The loose object is most likely compressed at the
> same compression ratio as the outgoing pack stream will use, so a
> deflate(inflate(loose)) cycle is going to be very close in total
> bytes used. If we over shoot the limit by more than some fudge
> factor (say 8K in 1M limit or 0.7%), abort before writing.

I already have an unsubmitted patch on top of the series that includes
the on-disk/compressed size of loose objects in the estimate. However,
it's quite intrusive (it needs to extend sha1_object_info() to return
the compressed size of loose objects). Also, since I don't yet take the
delta compression into account, these numbers are obviously unreliable.

That said, in the cases where loose objects are not deltified, it seems
the compressed/loose versions are about 3 to 7 bytes larger than the
corresponding compressed/packed versions. I guess this is due to the
loose files using a "<type> SP <size> NUL" text header (deflated),
whereas the pack uses a more compact binary format (not deflated).

We could test a large corpus (e.g. linux-kernel) to find the average
difference between compressed/loose size and compressed/packed size,
and then multiply this by the number of non-delta, non-reuse objects
to determine the fudge factor you describe above.

Have fun! :)


...Johan

--
Johan Herland, <johan@xxxxxxxxxxx>
www.herland.net
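
For illustration only, here is a rough Python sketch of the fudge-factor
idea discussed above. It is not code from the patch series. All function
names are hypothetical, and it assumes that loose objects and pack
entries are deflated at the same zlib level (Python's default here),
which may not match a given repository's settings. The pack entry
header length follows the variable-length encoding as I understand it:
4 size bits in the first byte, 7 in each following byte.

```python
import zlib


def loose_size(obj_type: str, data: bytes) -> int:
    """Compressed size of a loose object: the "<type> SP <size> NUL"
    text header is deflated together with the content."""
    header = f"{obj_type} {len(data)}".encode() + b"\0"
    return len(zlib.compress(header + data))


def pack_header_len(size: int) -> int:
    """Length in bytes of the (undeflated) binary pack entry header.
    Assumption: first byte carries 4 size bits, later bytes 7 each."""
    n = 1
    size >>= 4
    while size:
        n += 1
        size >>= 7
    return n


def packed_size(data: bytes) -> int:
    """Size of a non-delta pack entry: binary header + deflated content."""
    return pack_header_len(len(data)) + len(zlib.compress(data))


def fudge_factor(size_pairs, n_objects: int) -> float:
    """Average (loose - packed) difference over a sample corpus,
    multiplied by the number of non-delta, non-reuse objects.

    size_pairs: iterable of (loose_bytes, packed_bytes) measurements.
    """
    diffs = [loose - packed for loose, packed in size_pairs]
    return sum(diffs) / len(diffs) * n_objects


if __name__ == "__main__":
    # Tiny made-up corpus, standing in for e.g. a linux-kernel checkout.
    corpus = [("blob", b"hello world\n" * 50), ("blob", b"int main;\n" * 200)]
    pairs = [(loose_size(t, d), packed_size(d)) for t, d in corpus]
    print(pairs, fudge_factor(pairs, n_objects=1000))
```

The per-object differences this prints depend on the zlib version and
level, so they should be measured on a real corpus rather than taken
from this sketch.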