On 24/01/17 00:46, Jeff King wrote:
> This is similar to many of our uses of sha1-array, but it
> overcomes one limitation of a sha1-array: when you are
> de-duplicating a large input with relatively few unique
> entries, sha1-array uses 20 bytes per non-unique entry.
> Whereas this set will use memory linear in the number of
> unique entries (albeit a few more than 20 bytes due to
> hashmap overhead).
>
> Signed-off-by: Jeff King <peff@xxxxxxxx>
> ---
> This may be overkill. You can get roughly the same thing by making
> actual object structs via lookup_unknown_object(). But see the next
> patch for some comments on that.
>
>  Makefile |  1 +
>  oidset.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
>  oidset.h | 45 +++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 95 insertions(+)
>  create mode 100644 oidset.c
>  create mode 100644 oidset.h
>
> diff --git a/Makefile b/Makefile
> index 27afd0f37..e41efc2d8 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -774,6 +774,7 @@ LIB_OBJS += notes-cache.o
>  LIB_OBJS += notes-merge.o
>  LIB_OBJS += notes-utils.o
>  LIB_OBJS += object.o
> +LIB_OBJS += oidset.o
>  LIB_OBJS += pack-bitmap.o
>  LIB_OBJS += pack-bitmap-write.o
>  LIB_OBJS += pack-check.o
> diff --git a/oidset.c b/oidset.c
> new file mode 100644
> index 000000000..6094cff8c
> --- /dev/null
> +++ b/oidset.c
> @@ -0,0 +1,49 @@
> +#include "cache.h"
> +#include "oidset.h"
> +
> +struct oidset_entry {
> +	struct hashmap_entry hash;
> +	struct object_id oid;
> +};
> +
> +int oidset_hashcmp(const void *va, const void *vb,

static int oidset_hashcmp( ...

ATB,
Ramsay Jones