On Fri, May 11, 2018 at 08:19:59AM +0200, Duy Nguyen wrote: > On Fri, May 11, 2018 at 6:49 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > > Junio C Hamano <gitster@xxxxxxxxx> writes: > > > >> René Scharfe <l.s.r@xxxxxx> writes: > >> > >>>> But it somehow feels backwards in spirit to me, as the reason why we > >>>> use "void *" there in the decoration field is because we expect that > >>>> we'd have a pointer to some struture most of the time, and we have > >>>> to occasionally store a small integer there. > >>> > >>> Yes, fast-export seems to be the only place that stores an integer as > >>> a decoration. > >> > >> With the decoration subsystem that might be the case, but I think > >> we have other codepaths where "void * .util" field in the structure > >> is used to store (void *)1, expecting that a normal allocation will > >> never yield a pointer that is indistinguishable from that value. > > > > I was looking at a different topic and noticed that bisect.c uses > > commit->util (which is of type "void *") to always store an int (it > > never stores a pointer and there is no mixing). This one is equally > > unportable as fast-export after your fix. > > > > In both cases we should be able to use commit-slab instead of > commit->util. We could go even further and kill "util" pointer but > that's more work. I would love it if we could kill the util pointer in favor of commit-slab. Unfortunately the fast-export case is decorating non-commit objects, too. I'm not sure how possible/easy it would be to have an "object slab". Some complications are: 1. It would increase the size of "struct tree" and "struct blob" by at least 4 bytes (possibly 8, due to alignment) to store the slab index. Commits would stay the same, but my experience is that most repositories have 5-10 times as many trees/blobs as commits. Some of that memory might be reclaimable. E.g., if we moved tree->buffer and tree->size into a slab, callers which don't use them would actually come out ahead (and more callers than you might expect could do this, since many only need to look at the tree contents inside a single function). 2. If we allocate the indices for all types as a single sequence, then callers who only care about a specific type will have very sparse slabs (so they'll over-allocate, and it will have poor cache behavior). It might be possible to do something more clever. E.g., give each object type its own index counter, but then for a full object-slab, store per-type slabs and dereference obj->type to know which slab to look in. There'd be some complication with any_object. We'd probably need an OBJ_NONE slab, and then to allocate a type-specific index when the type is assigned. There's already a central point for this in object_as_type(). But we'd also have to migrate any previously-stored slab data (stored when it was OBJ_NONE), which implies having a global list of slabs. So I dunno. It seems do-able but complicated. -Peff