On Fri, Jul 05, 2019 at 01:14:13AM -0400, Jeff King wrote:
> On Thu, Jul 04, 2019 at 10:13:20PM +0900, Mike Hommey wrote:
>
> > > "public-inbox-index" (reading from git, writing to Xapian+SQLite)
> > > on a dev machine got slow because core count exceeded what SATA
> > > could handle and had to cap the default Xapian shard count to 3
> > > by default for v2 inboxes.
> >
> > AFAICT, git doesn't write from multiple threads.
>
> Right. That's always single threaded, and the main difference there is
> going to be what's in the delta base cache.
>
> > Oh right, I forgot to mention:
> > - I thought this memory usage thing was [1] but it turns out it was real
> >   memory usage.
> > - glibc's mallinfo stores values as int, so it's useless to know how
> >   much memory was allocated when it's more than 4GB.
> > - glibc's malloc_stats relies on the same int data, so while it does
> >   print "in use" data, it can't print values above 4GB correctly.
> > - glibc has a malloc_info function that, according to its manual page
> >   "addresses the deficiencies in malloc_stats and mallinfo", but while
> >   it outputs a large XML dump, it doesn't contain anything that looks
> >   remotely like the "in use" from malloc_stats.
> > - So all in all, I used jemalloc to gather the "allocated" stats.
>
> I think I explained all of the memory-usage questions in my earlier
> response, but just for reference: if you have access to it, valgrind's
> "massif" tool is really good for this kind of profiling. Something like:
>
>   valgrind --tool=massif git pack-objects ...
>   ms_print massif.out.*
>
> which shows heap usage at various times, points out the snapshot with
> peak usage, and shows a backtrace of the main culprits at a few
> snapshots.

At the expense of time ;) A run would likely last an entire day under
massif (by which I mean a full 24 hours, not a 9-5 day).

Mike