Re: [PATCH v3 2/2] alloc.c: remove the redundant commit_count variable

Jeff King <peff@xxxxxxxx> · Fri, 11 Jul 2014 04:32:20 -0400

On Fri, Jul 11, 2014 at 01:59:53AM +0100, Ramsay Jones wrote:

> > The code you're touching here was trying to make sure that each commit
> > gets a unique index, under the assumption that commits only get
> > allocated via alloc_commit_node. But I think that assumption is wrong.
> > We can also get commit objects by allocating an OBJ_NONE (e.g., via
> > lookup_unknown_object) and then converting it into an OBJ_COMMIT when we
> > find out what it is.
> 
> Hmm, I don't know how the object is converted, but the object allocator
> is actually allocating a 'union any_object', so it's allocating more
> space than for a struct object anyway.

Right, and for that reason we generally want to avoid
lookup_unknown_object where we can.
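
To make the two allocation paths concrete, here is a toy model. It is
not git's code: every toy_* name is made up for illustration, and
assigning the index at conversion time is just one assumed way to keep
indices unique. It shows why an unknown-type allocation is as big as
the largest member of the union, and how a commit can come into
existence without going through the commit allocator.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; these are not git's real structs. */
enum toy_type { TOY_NONE, TOY_COMMIT, TOY_BLOB };

struct toy_object { enum toy_type type; };
struct toy_commit { struct toy_object object; unsigned int index; };
struct toy_blob { struct toy_object object; };

/* Big enough for any type, so an unknown object costs the largest size. */
union toy_any_object {
	struct toy_object object;
	struct toy_commit commit;
	struct toy_blob blob;
};

static unsigned int toy_commit_count;

/* Path 1: allocated as a commit from the start; the index is set here. */
struct toy_commit *toy_alloc_commit(void)
{
	struct toy_commit *c = calloc(1, sizeof(*c));
	if (!c)
		return NULL;
	c->object.type = TOY_COMMIT;
	c->index = toy_commit_count++;
	return c;
}

/* Path 2: allocated before the type is known. */
union toy_any_object *toy_alloc_unknown(void)
{
	union toy_any_object *o = calloc(1, sizeof(*o));
	if (o)
		o->object.type = TOY_NONE;
	return o;
}

/*
 * Conversion never passes through toy_alloc_commit(), so it has to hand
 * out an index itself (an assumption of this sketch); otherwise every
 * converted commit would keep the zero left by calloc().
 */
struct toy_commit *toy_convert_to_commit(union toy_any_object *o)
{
	o->object.type = TOY_COMMIT;
	o->commit.index = toy_commit_count++;
	return &o->commit;
}
```

If the conversion function skipped the index assignment, the converted
commit would share index 0 with the first directly allocated one.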

> If you add an 'index' field to struct object (and remove it from
> struct commit), it could be set in alloc_object_node(), i.e. _all_
> node types get an index field.

That was something I considered when we did the original commit-slab
work, as it would let you do similar tricks for any set of objects, not
just commits. The reasons against it are:

  1. It would bloat the size of blob and tree structs by at least 4
     bytes (probably 8 for alignment). In most repos, commits make up
     only 10-20% of the total objects (so for linux.git, we're talking
     about 25MB extra in the working set).

  2. It makes single types sparse in the index space. In cases where you
     do just want to keep data on commits (and that is the main use),
     you end up allocating a slab entry per object, rather than per
     commit. That wastes memory (much worse than 25MB if your slab items
     are large), and reduces cache locality.
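
Point (2) can be sketched with a little arithmetic. The code below is
only a model: the struct, the function names, and the idea that a slab
is sized as "highest index times entry size" are simplifying
assumptions, not the real commit-slab implementation. It compares the
slab size when indices are handed out per commit with the size when
every object consumes an index.

```c
#include <stddef.h>

/* Hypothetical names for illustration; not git's real types. */
enum obj_type { T_COMMIT, T_TREE, T_BLOB };

struct counters {
	unsigned int next_commit_index; /* bumped only for commits: dense */
	unsigned int next_object_index; /* bumped for every object */
};

/* Record one allocation under both numbering schemes. */
void allocate(struct counters *c, enum obj_type type)
{
	c->next_object_index++;
	if (type == T_COMMIT)
		c->next_commit_index++;
}

/* Slab bytes needed when entries are indexed by the commit-only counter. */
size_t slab_bytes_per_commit(const struct counters *c, size_t entry_size)
{
	return (size_t)c->next_commit_index * entry_size;
}

/* Slab bytes needed when commits take their index from the shared counter. */
size_t slab_bytes_per_object(const struct counters *c, size_t entry_size)
{
	return (size_t)c->next_object_index * entry_size;
}
```

With one commit among ten objects and a 16-byte slab entry (both
numbers arbitrary), the per-commit scheme needs 16 bytes and the
per-object scheme needs 160, with the entries that do belong to commits
scattered among unused ones.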

You could probably get around (2) by splitting the index space by type
and allocating them in pools, but that complicates things considerably,
as you have to guess ahead of time at reasonable maximums for each type.
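
A rough sketch of what such pooling might look like follows. All names
and the fixed-capacity design are assumptions made for illustration. It
shows where the guessing comes in: each type gets a contiguous range of
the index space, sized up front, and a type that outgrows its guess has
nowhere to go.

```c
#include <stddef.h>

/* Hypothetical names for illustration; not an existing git API. */
enum pool_type { POOL_COMMIT, POOL_TREE, POOL_BLOB, POOL_NR };

struct index_pools {
	unsigned int base[POOL_NR]; /* first index of each type's range */
	unsigned int cap[POOL_NR];  /* guessed maximum count per type */
	unsigned int used[POOL_NR]; /* indices handed out so far */
};

/* Lay the per-type ranges out back to back in one index space. */
void pools_init(struct index_pools *p, const unsigned int cap[POOL_NR])
{
	unsigned int next = 0;
	int t;

	for (t = 0; t < POOL_NR; t++) {
		p->base[t] = next;
		p->cap[t] = cap[t];
		p->used[t] = 0;
		next += cap[t];
	}
}

/*
 * Store the next index for type t in *out and return 0, or return -1
 * when the guessed maximum for that type has been exhausted.
 */
int pools_next_index(struct index_pools *p, enum pool_type t,
		     unsigned int *out)
{
	if (p->used[t] >= p->cap[t])
		return -1;
	*out = p->base[t] + p->used[t]++;
	return 0;
}
```

Commits stay dense within their own range, so a commit-only slab can
subtract the range base and index from zero. The -1 path is the
complication: handling it means either growing and renumbering, or
over-reserving and wasting index space.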

-Peff