On Wed, May 2, 2018 at 8:07 PM, Jameson Miller <jamill@xxxxxxxxxxxxx> wrote:
>
>
>> -----Original Message-----
>> From: Duy Nguyen <pclouds@xxxxxxxxx>
>> Sent: Wednesday, May 2, 2018 1:02 PM
>> To: Stefan Beller <sbeller@xxxxxxxxxx>
>> Cc: Git Mailing List <git@xxxxxxxxxxxxxxx>; Jameson Miller
>> <jamill@xxxxxxxxxxxxx>
>> Subject: Re: [PATCH 00/13] object store: alloc
>>
>> On Tue, May 1, 2018 at 11:33 PM, Stefan Beller <sbeller@xxxxxxxxxx> wrote:
>> > I also debated if it is worth converting alloc.c via this patch series
>> > or if it might make more sense to use the new mem-pool by Jameson[1].
>> >
>> > I vaguely wonder about the performance impact, as the object
>> > allocation code seemed to be relevant in the past.
>>
>> If I remember correctly, alloc.c was added because malloc() has too high
>> an overhead per allocation (and we create like millions of them). As long
>> as you keep allocation overhead low, it should be ok. Note that we
>> allocate a lot more than the mem-pool's main target (cache entries, if I
>> remember correctly). We may have a couple thousand cache entries. We
>> already deal with a couple million struct objects.
>
> The work to move cache entry allocation onto a memory pool was motivated
> by the fact that we are looking at indexes with millions of entries. If
> there is a scaling concern with the current version of mem-pool, we would
> like to address it there as well. Or if there are improvements that can
> be shared, that would be nice as well.

I think the two have quite different characteristics. alloc.c is driven
by overhead: struct blob is only 24 bytes each, about 1/3 of the repo is
blobs, and each malloc() has 16 bytes of overhead or so if I remember
correctly. struct cache_entry is at minimum 88 bytes, so the relative
overhead there is not that big a deal (but sure, reducing it is still
very nice).

mem-pool is about allocation speed, but I think that is not a concern for
alloc.c, because when we do a full rev walk, I/O is always the bottleneck
(maybe object lookup as well).

I don't see a good way to have one memory allocator that satisfies both,
to be honest. If you could allocate fixed-size cache entries most of the
time (e.g. larger entries would be allocated using malloc or something,
and should be small in number), then perhaps we could unify. Or if
mem-pool could have an option to allocate fixed-size pieces with no
overhead, that would be great (sorry, I still have not read that mem-pool
series yet).
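To make the "fixed-size pieces with no overhead" idea concrete, here is a
rough, untested sketch in the spirit of what alloc.c already does: grab
one big slab per BLOCKING objects so the malloc() header is amortized
away. The names (fixed_pool, fixed_pool_alloc) are made up for
illustration, not the actual mem-pool API from Jameson's series, and it
assumes the caller zero-initializes the struct and sets ->size up front.

#include <stdlib.h> /* malloc(); git proper would use xmalloc() */

#define BLOCKING 1024 /* pieces carved out of each slab */

struct fixed_pool {
	char *slab;	/* current slab of BLOCKING * size bytes */
	size_t size;	/* size of one piece, e.g. sizeof(struct blob) */
	size_t nr;	/* pieces already handed out from this slab */
};

static void *fixed_pool_alloc(struct fixed_pool *pool)
{
	if (!pool->slab || pool->nr == BLOCKING) {
		/* one malloc() header amortized over BLOCKING pieces */
		pool->slab = malloc(BLOCKING * pool->size);
		if (!pool->slab)
			return NULL; /* git would just die() via xmalloc() */
		pool->nr = 0;
	}
	return pool->slab + pool->size * pool->nr++;
}

The obvious downside (which alloc.c also has) is that individual pieces
can never be freed. That is probably fine for struct object, which lives
for the whole process, but maybe not for cache entries that get
discarded.

--
Duy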