Am 27.06.21 um 04:47 schrieb Eric Wong: > There's no point in using 8 bits per-directory when 1 bit > will do. This saves us 224 bytes per object directory, which > ends up being 22MB when dealing with 100K alternates. The point was simplicity under the assumption that the number of repositories is low -- for most users it's only one. That obviously doesn't hold for your use case anymore. :) A compact representation should also reduce dcache misses, so this should be a win for the single-repo case as well. > Signed-off-by: Eric Wong <e@xxxxxxxxx> > --- > object-file.c | 10 +++++++--- > object-store.h | 2 +- > 2 files changed, 8 insertions(+), 4 deletions(-) > > diff --git a/object-file.c b/object-file.c > index 6be43c2b60..2c8b9c05f9 100644 > --- a/object-file.c > +++ b/object-file.c > @@ -2463,12 +2463,16 @@ struct oid_array *odb_loose_cache(struct object_directory *odb, > { > int subdir_nr = oid->hash[0]; > struct strbuf buf = STRBUF_INIT; > + size_t BM_SIZE = sizeof(odb->loose_objects_subdir_seen[0]) * CHAR_BIT; With that name I'd expect the variable to contain the number of bytes or bits in the whole bitmap. And to not be a variable at all, but rather a macro. Perhaps word_bits? bitsizeof() does the same and is slightly shorter. > + uint32_t *bitmap; Ah, you call the array items bitmap, which they are. Hmm. I rather think of the whole thing as a bitmap and its uint32_t elements as words. Does it matter? Not sure. > + uint32_t bit = 1 << (subdir_nr % BM_SIZE); I'd call that mask, but bit is fine as well.. Anyway, it would look something like this: size_t word_bits = bitsizeof(odb->loose_objects_subdir_seen[0]); size_t word_index = subdir_nr / word_bits; size_t mask = 1 << (subdir_nr % word_bits); > > if (subdir_nr < 0 || > - subdir_nr >= ARRAY_SIZE(odb->loose_objects_subdir_seen)) > + subdir_nr >= ARRAY_SIZE(odb->loose_objects_subdir_seen) * BM_SIZE) bitsizeof(odb->loose_objects_subdir_seen) would be easier to read and understand, I think. > BUG("subdir_nr out of range"); > > - if (odb->loose_objects_subdir_seen[subdir_nr]) > + bitmap = &odb->loose_objects_subdir_seen[subdir_nr / BM_SIZE]; > + if (*bitmap & bit) > return &odb->loose_objects_cache[subdir_nr]; > > strbuf_addstr(&buf, odb->path); > @@ -2476,7 +2480,7 @@ struct oid_array *odb_loose_cache(struct object_directory *odb, > append_loose_object, > NULL, NULL, > &odb->loose_objects_cache[subdir_nr]); > - odb->loose_objects_subdir_seen[subdir_nr] = 1; > + *bitmap |= bit; > strbuf_release(&buf); > return &odb->loose_objects_cache[subdir_nr]; > } > diff --git a/object-store.h b/object-store.h > index 20c1cedb75..8fcddf3e65 100644 > --- a/object-store.h > +++ b/object-store.h > @@ -22,7 +22,7 @@ struct object_directory { > * > * Be sure to call odb_load_loose_cache() before using. > */ > - char loose_objects_subdir_seen[256]; > + uint32_t loose_objects_subdir_seen[8]; /* 256 bits */ Perhaps DIV_ROUND_UP(256, bitsizeof(uint32_t))? The comment explains it nicely already, though. > struct oid_array loose_objects_cache[256]; > > /* > Summary: Good idea, the implementation looks correct, I stumbled over some of the names, bitsizeof() could be used. René