On 4/3/2018 4:20 PM, Jeff King wrote:
On Tue, Apr 03, 2018 at 09:14:50AM -0400, Derrick Stolee wrote:
I'm not sure what the exact solution would be, but I imagine something
like variable-sized "struct commit"s with the parent pointers embedded,
with some kind of flag to indicate the number of parents (and probably
some fallback to break out to a linked list for extreme cases of more
than 2 parents). It may end up pretty invasive, though, as there's a
lot of open-coded traversals of that parent list.
Anyway, not anything to do with this patch, but food for thought as you
micro-optimize these traversals.
One other thing that I've been thinking about is that 'struct commit' is so
much bigger than the other structs in 'union any_object'. This means that
the object cache, which I think creates blocks of 'union any_object' for
memory-alignment reasons, is overly bloated. This would be especially true
when walking many more trees than commits.
Perhaps there are other memory concerns in that case (such as cached
buffers) that the 'union any_object' is not a concern, but it is worth
thinking about as we brainstorm how to reduce the parent-list memory.
It definitely bloats any_object, but I don't think we typically allocate
too many of those. Those should only come from lookup_unknown_object(),
but typically we'll come across objects by traversing the graph, in
which case we have an expectation of the type (and use the appropriate
lookup_foo() function, which uses the type-specific block allocators).
Thanks for the clarification here. I'm glad I was wrong about how often
any_object is used.
Thanks,
-Stolee