On Wed, 2016-06-01 at 16:09 -0400, Jeff King wrote:
> On Wed, Jun 01, 2016 at 03:50:29PM -0400, David Turner wrote:
> 
> > On Wed, 2016-06-01 at 03:42 -0400, Jeff King wrote:
> > >   2. Do caching tricks for strbufs used in tight loops. For example,
> > >      have strbuf_release() throw its buffer into a last-used cache, and
> > >      let the next strbuf_grow() use that cache entry. This cuts malloc()
> > >      out of the loop.
> > > 
> > >      You'd probably want to protect the cache with a mutex, though.
> > 
> > ... or make the last-used cache be thread-local.
> 
> Good idea.
> 
> I almost didn't mention threading at all, because I'd be surprised if
> malloc lock contention is a serious bottleneck for us anywhere.
> 
> It's hard to really say much concrete because it's not clear to me where
> people think strbuf allocation _is_ a bottleneck (or again, _would be_
> if we were to use it more).
> 
> If we imagine a loop like this:
> 
>     for (i = 0; i < n; i++) {
>         struct strbuf buf = STRBUF_INIT;
>         do_something_with(&buf);
>         strbuf_release(&buf);
>     }
> 
> then yes, we'll call malloc() and free() a lot of times. But unless your
> libc malloc is terrible, those calls are all probably just taking a
> mutex and reusing the same block from a free list. The strbuf case is
> actually really friendly to allocators because our blocks tend to be
> predictable sizes (they're sized by ALLOC_GROW, which has a
> deterministic set of block sizes).
> 
> In practice, I'd imagine loops get more complicated than that. They call
> sub-functions which allocate a strbuf, and sometimes don't release it
> until other strbufs have been allocated, etc. The proposal in this
> thread is basically using the call stack to mirror the allocation
> patterns. So when the pattern of allocations matches an actual stack,
> that's good, but I'd expect a reasonably smart allocator to perform
> similarly. And when it's not a true stack, the stack-based strbufs are
> going to end up with wasted stack space hanging around even after we've
> free()'d the memory and could reuse it. And I'd still expect an
> allocator based off a linked list or something to handle such cases
> pretty well.
> 
> There are also other non-big-O factors at play in modern systems, like
> CPU cache footprints. Is heap memory cheaper or more expensive to access
> than stack? I can imagine stack is kept in the cache better, but if
> you're reusing the same heap block over and over, it probably stays in
> cache, too. But if you have unused stack buffers that _could_ be reused
> (but won't be because you'd have to manually feed them to a new strbuf),
> that hurts your footprint. And on top of that, the stack proposals I've
> seen generally involve over-sizing the stack buffers out of a fear of
> actually calling malloc.
> 
> And then on top of that, there is the question of whether anything
> involving a strbuf allocation is actually a tight enough loop to even
> _care_ about cache dynamics.
> 
> So again, I'm slightly negative on anything that makes the code even
> slightly more complex, especially the call sites, if we can't show
> actual measurable improvement numbers.
> 
> Sorry, this turned into a mini-rant, and it's not really directed at
> you, David. Your email was just what spurred me to put some of these
> thoughts into words. :)

I agree with all this stuff anyway.
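
For anyone who wants to see the shape of the "thread-local last-used
cache" idea being discussed above, here is a minimal sketch. The sbuf
struct, the helper names, and the __thread storage class (a GCC/Clang
extension; C11 spells it _Thread_local) are simplified assumptions for
illustration only; this is not the actual strbuf code in git:

    #include <stdlib.h>

    /* Simplified stand-in for a growable string buffer. */
    struct sbuf {
        size_t alloc;   /* bytes allocated in buf */
        size_t len;     /* bytes currently used */
        char *buf;
    };

    /* One cached block per thread, so reuse needs no mutex. */
    static __thread char *cached_buf;
    static __thread size_t cached_alloc;

    static void sbuf_grow(struct sbuf *sb, size_t extra)
    {
        size_t want = sb->len + extra + 1;
        char *p;

        if (want <= sb->alloc)
            return;

        /* Reuse this thread's cached block instead of calling malloc(). */
        if (!sb->buf && cached_buf && cached_alloc >= want) {
            sb->buf = cached_buf;
            sb->alloc = cached_alloc;
            cached_buf = NULL;
            cached_alloc = 0;
            return;
        }

        p = realloc(sb->buf, want);
        if (!p)
            abort();    /* a real implementation would report the error */
        sb->buf = p;
        sb->alloc = want;
    }

    static void sbuf_release(struct sbuf *sb)
    {
        if (sb->buf) {
            if (!cached_buf) {
                /* Stash the block for the next sbuf on this thread. */
                cached_buf = sb->buf;
                cached_alloc = sb->alloc;
            } else {
                free(sb->buf);
            }
        }
        sb->buf = NULL;
        sb->alloc = sb->len = 0;
    }

Because the cache lives in thread-local storage, the release/grow path
never takes a lock; the tradeoff is that a block released on one thread
cannot be picked up by another, and each thread keeps at most one idle
block pinned between uses.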