> -----Original Message-----
> From: git-owner@xxxxxxxxxxxxxxx <git-owner@xxxxxxxxxxxxxxx> On Behalf Of
> Stefan Beller
> Sent: Thursday, May 3, 2018 4:59 PM
> To: Duy Nguyen <pclouds@xxxxxxxxx>
> Cc: Jameson Miller <jamill@xxxxxxxxxxxxx>; git@xxxxxxxxxxxxxxx;
> gitster@xxxxxxxxx; jonathantanmy@xxxxxxxxxx
> Subject: Re: [PATCH v2 0/5] Allocate cache entries from memory pool
>
> On Thu, May 3, 2018 at 12:17 PM, Duy Nguyen <pclouds@xxxxxxxxx> wrote:
> >
> >> To me it is also a clear yes when it comes to combining these two
> >> memory pools.
> >
> > I also did not notice that jm/mem-pool already landed in master.
>
> Oh, thanks for telling! Now that I look at it, I am doubting it;
>
> The reason for my doubt is the potential quadratic behavior for new
> allocations: in mem_pool_alloc() we walk all mp_blocks to see if we can
> fit the requested allocation in one of the later blocks. So if we call
> mem_pool_alloc a million times, we get O(n) mp_blocks which we'd have to
> walk in each call.

With the current design, when a new mp_block is allocated, it is placed at
the head of the linked list. This means that the most recently allocated
mp_block is the first block that is searched. The *vast* majority of
allocations should be fulfilled from this first block; it is only when the
block is full that we search the other mp_blocks in the list.

If this is a concern, I think we have a couple of low-cost options to
mitigate it: maybe a flag to control whether we search past the first
mp_block for space, or logic to move blocks out of the search queue when
they are full or their available space falls below a threshold. If this is
of interest, I could contribute a patch to enable one of these behaviors.
(A rough sketch of the second option is included at the end of this mail.)

> However, in alloc.c we do know that a slab is full as soon as we take the
> next slab. That is the beauty of knowing 'len' at construction time of
> the allocator.
>
> So I guess I'll just re-use the mp_block and introduce another struct
> fixed_sized_mem_pool, which will not look into other mp_blocks but the
> current one.
>
> > Have you tried to measure it and alloc.c (both memory usage and
> > allocation speed)?
>
> No, I was about to, but then I started reading the code in an attempt to
> replace alloc.c with a mempool and saw the quadratic behavior.
>
> > Just take some big repo as an example and do count-objects -v to see
> > how many blobs/trees/commits it has, then allocate the same amount
> > with both alloc.c and mem-pool.c and measure both speed/mem.
> > I'm pretty sure you're right that mem-pool.c is a clear yes. I was
> > just being more conservative because we do (slightly) change the
> > allocator's behavior when we make the switch. But it's also very
> > likely that any performance difference will be insignificant.
> >
> > I'm asking this because if mem-pool.c is a clear winner, you can start
> > to update your series to use it now and kill alloc.c in the process.
>
> I'll implement the fixed_sized_mem_pool and take some measurements.
>
> >
> > PS. Is Jeff back yet?
>
> His last email on the public list was Apr 10th, stating that he'll be
> offline for "a few weeks"; in <20180406175349.GB32228@xxxxxxxxxxxxxxxxxxxxx>
> he said the vacation part is 3 weeks. So I think he is done with vacation
> and is just hiding to figure out a nice comeback. ;-)
>
> > I'm sure Junio is listening and all, but I'm afraid he's too busy being
> > a maintainer, so Jeff's opinion in this area is really valuable. He has
> > all the fun and weird use cases to play with at GitHub.
>
> ok. I'll cc him for these patches.
>
> Thanks,
> Stefan
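
To make the second option above concrete, here is a rough sketch of a pool
that retires a block from the search list once its free space drops below a
threshold, so mem_pool_alloc() only ever walks blocks that can still serve a
request. This is a simplified illustration, not the actual mem-pool.c layout:
the open_blocks/full_blocks split and MP_RETIRE_THRESHOLD are made-up names
for this sketch, and alignment handling is omitted for brevity.

#include <stddef.h>
#include <stdlib.h>

struct mp_block {
	struct mp_block *next_block;
	char *next_free;
	char *end;
};

struct mem_pool {
	struct mp_block *open_blocks;  /* blocks that may still have room */
	struct mp_block *full_blocks;  /* blocks retired from the search */
	size_t block_alloc;            /* default payload size for new blocks */
};

/* Blocks whose free space drops below this are retired from the search. */
#define MP_RETIRE_THRESHOLD 64

static struct mp_block *mem_pool_new_block(struct mem_pool *pool, size_t min_size)
{
	size_t payload = pool->block_alloc > min_size ? pool->block_alloc : min_size;
	struct mp_block *block = malloc(sizeof(*block) + payload);

	if (!block)
		return NULL;
	block->next_free = (char *)(block + 1);
	block->end = block->next_free + payload;
	/* Newest block goes to the head, so it is the first one searched. */
	block->next_block = pool->open_blocks;
	pool->open_blocks = block;
	return block;
}

void *mem_pool_alloc(struct mem_pool *pool, size_t len)
{
	struct mp_block **link = &pool->open_blocks;
	struct mp_block *block;
	char *ret;

	/* Walk only blocks that still have room; full ones were retired. */
	for (block = pool->open_blocks; block; block = block->next_block) {
		if (block->end - block->next_free >= (ptrdiff_t)len)
			break;
		link = &block->next_block;
	}
	if (!block) {
		block = mem_pool_new_block(pool, len);
		if (!block)
			return NULL;
		link = &pool->open_blocks;
	}

	ret = block->next_free;
	block->next_free += len;

	/* Once the block's remaining space falls below the threshold, move it
	 * to the full list so later allocations never walk it again. */
	if (block->end - block->next_free < MP_RETIRE_THRESHOLD) {
		*link = block->next_block;
		block->next_block = pool->full_blocks;
		pool->full_blocks = block;
	}
	return ret;
}

Whether the threshold is worth it probably depends on the measurements Duy
asked for; the flag-based variant (never look past the head block) would be
an even smaller change.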
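
And, in the same simplified style, a sketch of the fixed_sized_mem_pool idea
from the quoted text: because the object size is fixed at construction time,
the current block is known to be full the moment it cannot hold one more
object, so allocation never looks at older blocks. The struct and field
names here are again illustrative only, not a proposed interface.

#include <stdlib.h>

/* Same minimal block shape as in the sketch above. */
struct mp_block {
	struct mp_block *next_block;
	char *next_free;
	char *end;
};

struct fixed_sized_mem_pool {
	struct mp_block *current;  /* the only block ever considered */
	size_t obj_size;           /* fixed at construction time */
	size_t objs_per_block;
};

void *fixed_sized_mem_pool_alloc(struct fixed_sized_mem_pool *fp)
{
	struct mp_block *b = fp->current;
	char *ret;

	if (!b || (size_t)(b->end - b->next_free) < fp->obj_size) {
		/* The old block is full by construction; push a fresh one.
		 * Older blocks stay reachable for a later bulk free, but
		 * they are never searched again. */
		size_t payload = fp->obj_size * fp->objs_per_block;

		b = malloc(sizeof(*b) + payload);
		if (!b)
			return NULL;
		b->next_free = (char *)(b + 1);
		b->end = b->next_free + payload;
		b->next_block = fp->current;
		fp->current = b;
	}

	ret = b->next_free;
	b->next_free += fp->obj_size;
	return ret;
}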