Johannes Sixt said the following on 04.04.2009 08:35:
On Saturday, 4 April 2009, Marius Storm-Olsen wrote:
Pat Thoyts said the following on 03.04.2009 23:12:
The difference on Windows Vista is that the low fragmentation
heap is the default memory allocator. On Windows XP you need to
enable it specifically for an application. So a possible
alternative to this is just to enable the low fragmentation
heap. (done via GetProcessHeaps and HeapSetInformation Win32
API calls).
I know about the low-fragmentation heap, but given that it was
only supported on XP and up (and given that I also had MacOSX in
mind when considering a custom allocator; seeing that MacOSX got 12%
itself ;-), I didn't even consider it. Thanks for clearing up the
differences on the Vista and XP benchmarks though! Makes sense.
Wouldn't a GetProcessHeaps/HeapSetInformation solution add much
less code, even with a runtime check whether the feature is
supported?
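
For reference, a minimal sketch of what the GetProcessHeaps/HeapSetInformation route with a runtime check could look like. The function name enable_low_frag_heap and the 64-entry handle buffer are assumptions made for illustration, not part of any patch in this thread. HeapSetInformation is looked up through GetProcAddress so that the binary still loads on systems where kernel32.dll does not export it. On non-Windows builds the function is a stub that returns 0.

```c
#ifdef _WIN32
#include <windows.h>

#define MAX_HEAPS 64 /* assumed upper bound, chosen for this sketch */

/* Signature of HeapSetInformation, resolved at runtime. */
typedef BOOL (WINAPI *heap_set_info_fn)(HANDLE, HEAP_INFORMATION_CLASS,
					PVOID, SIZE_T);

/* Returns the number of heaps switched to the low fragmentation heap. */
int enable_low_frag_heap(void)
{
	HANDLE heaps[MAX_HEAPS];
	DWORD i, count;
	ULONG lfh = 2; /* 2 selects the low fragmentation heap */
	int enabled = 0;
	heap_set_info_fn set_info = (heap_set_info_fn)
		GetProcAddress(GetModuleHandleA("kernel32.dll"),
			       "HeapSetInformation");

	if (!set_info)
		return 0; /* feature not available on this system */

	/* A count above MAX_HEAPS means no handles were stored. */
	count = GetProcessHeaps(MAX_HEAPS, heaps);
	if (count == 0 || count > MAX_HEAPS)
		return 0;

	for (i = 0; i < count; i++)
		if (set_info(heaps[i], HeapCompatibilityInformation,
			     &lfh, sizeof(lfh)))
			enabled++;
	return enabled;
}
#else
/* Nothing to do on other platforms. */
int enable_low_frag_heap(void)
{
	return 0;
}
#endif
```

Such a function would be called once, early in main(), and its return value can be ignored, since a failure only means the default allocator stays in place.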
It certainly would, if you'd also like to simply ignore the extra
benefit to MacOSX, which was a not-so-bad additional 12%. However, I
and several of my colleagues also use Git on Mac and see that any
improvement in the performance there would also be welcome. So, I went
with the custom allocator approach, instead of just looking into XP.
There might also be other platforms which could benefit from such a
custom allocator, so I figured that there were many positive sides to
this, rather than just going for the Low Fragmentation Heap on Windows.
The improvement that you observed is in a rather special area
(repack). How is the improvement in day-to-day tools?
- porcelains used on command line: git-status, git-add, git-commit,
git-diff, git-log, perhaps even local git-fetch.
- plumbing used by guis: git-diff-files, git-diff-tree, git-log,
git-rev-parse
- I'm not even mentioning git-am, git-rebase, because here the time
sink is the fork emulation.
I doubt that the improvement is equally great, and it will perhaps
vanish in the noise. 7000+ LOC is a bit much in this case, don't
you think so?
I went with repack because it's a lot of data munging, and not so high
IO. Clearly more I/O intensive git operations would not benefit as
much as repack. But the goal in this patch was not to speed up I/O.
Obviously there are things that can be, and should be done for the I/O
side too, but that's a separate subject.
I don't see the 7000+ LOC as such a big deal, given that they are all
neatly tucked away in a compat subdirectory. They don't even add any
additional source code to the codepaths, since you just link with it.
Given the benefits, even 5% better than the Low-Frag case for
single-threaded cases (which are the most dominant in git anyway), I
think it's reason enough to include it. The 12% boost on Mac should
also underline this.
Running 'git blame' on one of the files in my repos gives me this result:
XP
  Without nedmalloc: 11.218 sec
  With nedmalloc:     9.514 sec (18% speedup)
OSX
  Without nedmalloc: 15.046 sec
  With nedmalloc:    13.957 sec (8% speedup)
I'll take those speedups any day :-)
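
The percentages above are throughput speedups (old time divided by new time, minus one), which is not the same thing as the fraction of wall-clock time saved. A small sketch of both calculations follows. The helper names speedup_percent and time_saved_percent are made up for this illustration.

```c
/* Throughput speedup in percent: how much more work gets done per second. */
double speedup_percent(double before_sec, double after_sec)
{
	return (before_sec / after_sec - 1.0) * 100.0;
}

/* Reduction in wall-clock time, in percent of the original run time. */
double time_saved_percent(double before_sec, double after_sec)
{
	return (before_sec - after_sec) / before_sec * 100.0;
}
```

With the XP timings (11.218 and 9.514) this gives a speedup of about 17.9% and a time saving of about 15.2%. With the OSX timings (15.046 and 13.957) it gives about 7.8% and 7.2%.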
BTW, I assume that the Boost license is compatible with GPL. But
did you check that?
Of course I did, you'll find it under
http://www.fsf.org/licensing/licenses/index_html#GPLCompatibleLicenses
--
.marius