Re: pack operation is thrashing my server

Nicolas Pitre <nico@xxxxxxx> wrote:
> On Wed, 13 Aug 2008, Shawn O. Pearce wrote:
> > 
> > Where little memory systems get into trouble with already packed
> > repositories is enumerating the objects to include in the pack.
> 
> I'm counting something like 104 bytes on a 64-bit machine for
> struct object_entry.

Don't forget that we need not just struct object_entry, but
also the struct commit/tree/blob, their hash tables, the
struct object_entry* in the sorted object list table, and
the pack reverse index table.  It does add up.
 
> > Have 500k objects and it's suddenly something quite real in terms
> > of memory usage.
> 
> Well, we are talking about 50MB which is not that bad.

I think we're closer to 100MB here due to the extra overheads
I just mentioned, which weren't in your 104-byte per-object
figure.
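
As a rough check of the numbers in this exchange, the arithmetic can
be sketched as below.  Only the 104 bytes, the 500k objects and the
~100MB total come from the thread; the function names are made up for
illustration, and the "extra bytes per object" is simply what the
100MB estimate implies, not a measured value.

```python
# Figures taken from the thread; everything else is illustrative.
ENTRY_BYTES = 104          # struct object_entry on a 64-bit machine
OBJECTS = 500_000          # "500k objects"
ESTIMATED_TOTAL = 100_000_000  # the "closer to 100MB" estimate


def table_bytes(objects, bytes_per_object):
    """Memory for a table holding one fixed-size record per object."""
    return objects * bytes_per_object


def implied_extra_per_object(total_bytes, objects, entry_bytes):
    """Per-object overhead a total implies beyond object_entry itself."""
    return total_bytes // objects - entry_bytes


entries_only = table_bytes(OBJECTS, ENTRY_BYTES)
extra = implied_extra_per_object(ESTIMATED_TOTAL, OBJECTS, ENTRY_BYTES)

print(entries_only / 1e6, "MB for struct object_entry alone")
print(extra, "extra bytes per object implied by a 100MB total")
```

So the 50MB figure covers struct object_entry only, and the 100MB
estimate amounts to roughly doubling the per-object cost once the
other structures are counted.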

> However there is a point where we should be realistic and just admit 
> that you need a sufficiently big machine if you have huge repositories 
> to deal with.  Git should be fine serving pull requests with relatively 
> little memory usage, but anything else such as the initial repack simply
> requires enough RAM to be effective.

Yeah.  But it would also be nice to be able to just concat packs
together.  Especially if the repository in question is an open source
one and everything published is already known to be in the wild,
say because it is also available over dumb HTTP.  Yeah, I know people
like the 'security feature' of the packer not including objects
which aren't reachable.

But how many times has Linus published something to his linux-2.6
tree that he didn't mean to publish and had to rewind?  I think
that may be "never".  Yet how many times per day does his tree get
cloned from scratch?

This is also true for many internal corporate repositories.
Users probably have full read access to the object database anyway,
and maybe even have direct write access to it.  Doing the object
enumeration there is pointless as a security measure.

I'm too busy to write a pack concat implementation proposal, so
I'll just shut up now.  But it wouldn't be hard if someone wanted
to improve at least the initial clone serving case.
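
For illustration only, here is a minimal sketch of what "concat packs"
could mean at the file-format level.  It is not a proposal from this
thread and not how git does anything today.  It assumes the usual pack
layout (a 12-byte header of "PACK", version and object count, then the
object data, then a 20-byte SHA-1 trailer), and it assumes the input
packs share no objects and are not thin packs.  The names split_pack
and concat_packs are hypothetical.

```python
import hashlib
import struct

HEADER = struct.Struct(">4sII")  # signature, version, object count
TRAILER_LEN = 20                 # SHA-1 over everything before it


def split_pack(data):
    """Return (version, object_count, body) for one pack's bytes."""
    signature, version, count = HEADER.unpack_from(data, 0)
    if signature != b"PACK":
        raise ValueError("not a pack file")
    if hashlib.sha1(data[:-TRAILER_LEN]).digest() != data[-TRAILER_LEN:]:
        raise ValueError("pack checksum mismatch")
    return version, count, data[HEADER.size:-TRAILER_LEN]


def concat_packs(packs):
    """Join packs: summed object count, bodies back to back, new trailer."""
    parts = [split_pack(p) for p in packs]
    versions = {version for version, _, _ in parts}
    if len(versions) != 1:
        raise ValueError("need at least one pack, all of the same version")
    total = sum(count for _, count, _ in parts)
    out = HEADER.pack(b"PACK", versions.pop(), total)
    out += b"".join(body for _, _, body in parts)
    return out + hashlib.sha1(out).digest()
```

The point is that no object enumeration or delta search is involved,
only copying bytes and recomputing one checksum.  The result would
still need a fresh index, for example from git index-pack, before it
could be used as a local pack.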

-- 
Shawn.