On Wed, 13 Aug 2008, Shawn O. Pearce wrote:

> Nicolas Pitre <nico@xxxxxxx> wrote:
> > You'll have memory usage issues whenever such objects are accessed,
> > loose or not.  However, once those big objects are packed once, they
> > can be repacked (or streamed over the net) without really "accessing"
> > them.  Packed object data is simply copied into a new pack in that
> > case, which is less of an issue on memory usage, irrespective of the
> > original pack size.
>
> And fortunately here we actually do stream the objects we have chosen
> to reuse from the pack.  We don't allocate the entire thing in memory.
> It's probably the only place in all of Git where we can handle a 16 GB
> (after compression) object on a machine with only 2 GB of memory and
> no swap.
>
> Where little-memory systems get into trouble with already packed
> repositories is enumerating the objects to include in the pack.  This
> can still blow out their physical memory if the number of objects to
> pack is high enough.  We need something like 160 bytes of memory (my
> own memory is fuzzy on that estimate) per object.

I'm counting something like 104 bytes on a 64-bit machine for
struct object_entry.

> Have 500k objects and it's suddenly something quite real in terms of
> memory usage.

Well, we are talking about 50MB, which is not that bad.

However, there is a point where we should be realistic and just admit
that you need a sufficiently big machine if you have huge repositories
to deal with.  Git should be fine serving pull requests with relatively
little memory usage, but anything else, such as the initial repack,
simply requires enough RAM to be effective.


Nicolas
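
To make the arithmetic above concrete, here is a minimal C sketch.  The
structure below is a hypothetical approximation of the kind of per-object
bookkeeping pack-objects keeps while enumerating objects (object name,
pack offsets, delta-chain pointers, sizes, flags); it is not the actual
struct object_entry from builtin-pack-objects.c, but on a typical LP64
machine it pads out to about the same 104 bytes, so 500k entries come to
roughly 50MB:

#include <stdio.h>
#include <stdint.h>

/*
 * Hypothetical approximation of the per-object bookkeeping that
 * pack-objects keeps while enumerating objects.  This is NOT the
 * real struct object_entry from builtin-pack-objects.c; it only
 * mimics the kind of fields involved so the memory arithmetic can
 * be checked.
 */
struct approx_object_entry {
	unsigned char sha1[20];		/* object name */
	uint32_t crc32;			/* CRC of the packed data */
	uint64_t offset;		/* offset in the output pack */
	uint64_t size;			/* uncompressed object size */
	void *in_pack;			/* source pack when data is reused */
	uint64_t in_pack_offset;	/* offset in that source pack */
	struct approx_object_entry *delta;		/* delta base */
	struct approx_object_entry *delta_child;	/* first object deltified against us */
	struct approx_object_entry *delta_sibling;	/* next object sharing our base */
	void *delta_data;		/* cached delta, if any */
	uint64_t delta_size;		/* size of that delta */
	uint32_t hash;			/* path name hash for delta sorting */
	uint32_t flags;			/* type, preferred_base, etc. */
};

int main(void)
{
	size_t nr_objects = 500000;
	size_t per_entry = sizeof(struct approx_object_entry);

	/* On a typical LP64 system this prints 104 bytes per entry,
	 * i.e. about 50MB for half a million objects. */
	printf("per-object entry: %zu bytes\n", per_entry);
	printf("%zu objects: %.1f MB\n", nr_objects,
	       (double)(per_entry * nr_objects) / (1024 * 1024));
	return 0;
}

The exact figure depends on compiler padding and on which fields the real
structure carries, but the order of magnitude -- roughly a hundred bytes
per object, so tens of megabytes for a few hundred thousand objects -- is
what matters for the discussion above.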