On Fri, Jan 28, 2011 at 20:08, Nicolas Pitre <nico@xxxxxxxxxxx> wrote:
>> pack is actually smaller 376.30 MiB vs. C Git's 380.59 MiB. I point
>> out this data because improvements made to JGit may show similar
>> improvements to CGit given how close they are in running time.
>
> What are those improvements?

None right now. JGit is similar to CGit algorithm-wise. (Actually it
looks like JGit has a faster diff implementation, but that's a
different email.)

If you are asking why JGit created a slightly smaller pack file... it
splits the delta window during threaded delta search differently than
CGit does, and we align our blocks slightly differently when comparing
two objects to generate a delta sequence for them. These two
variations mean JGit produces different deltas than CGit does.
Sometimes we are smaller, sometimes we are larger. But it's a small
difference, on the order of 1-4 MiB for something like linux-2.6. I
don't think it's worthwhile to analyze the specific differences
between the two implementations and retrofit them into each other.

What I was trying to say was: _if_ we made a change to JGit and it
dropped the running time, that same change in CGit should see _at
least_ the same running time improvement, if not better. I was
pointing out that this cached-pack change dropped the running time by
1 minute, so CGit should also see a similar improvement (if not
better).

I would prefer to test against CGit for this sort of thing, but it's
been too long since I last poked pack-objects.c and the revision code
in CGit, while the JGit equivalents are really fresh in my head.

> Now, the fact that JGit is so close to CGit must be because the
> actual cost is outside of them such as within zlib, otherwise the C
> code should normally always be faster, right?

Yup, I mostly agree with this statement. CGit does a lot of
malloc/free activity when reading objects in. JGit does too, but we
often fit into the young generation for the GC, which sometimes can be
faster to clean and recycle memory in. We're not too far off from C
code. But yes... our profile looks like this too:

> Looking at the profile for "git rev-list --objects --all > /dev/null"
> for the object enumeration phase, we have:
>
> # Samples: 1814637
> #
> # Overhead  Command  Shared Object          Symbol
> # ........  .......  .....................  ......
> #
>     28.81%  git  /home/nico/bin/git     [.] lookup_object
>     12.21%  git  /lib64/libz.so.1.2.3   [.] inflate
>     10.49%  git  /lib64/libz.so.1.2.3   [.] inflate_fast
>      7.47%  git  /lib64/libz.so.1.2.3   [.] inflate_table
>      6.66%  git  /lib64/libc-2.11.2.so  [.] __GI_memcpy
>      5.66%  git  /home/nico/bin/git     [.] find_pack_entry_one
>      2.98%  git  /home/nico/bin/git     [.] decode_tree_entry
> [...]
>
> So we've got lookup_object() clearly at the top.

Isn't this the hash table lookup inside the revision pool, to see if
the object has already been visited? That seems horrible, 28% of the
CPU is going to probing that table.

> I suspect the hashcmp() in there, which probably gets inlined, is
> responsible for most cycles.

Probably true. I know our hashcmp() is inlined; it's actually written
by hand as 5 word compares, and is marked final, so the JIT is rather
likely to inline it.

> There is certainly a better way here, and probably in JGit you
> rely on some optimized facility provided by the language/library to
> perform that lookup. So there is probably some easy improvements that
> can be made here.

Nope. Actually we have to bend over backwards and work against the
language to get anything even reasonably sane for performance. Our
"solution" in JGit has actually been used by Rob Pike to promote his
Go programming language, as an example of why Java sucks as a
language. It's a great quote of mine that someone dragged up off the
git@vger mailing list and started using to promote Go.

At least once a week I envy how easy it is to use hashcmp() and
hashcpy() inside of CGit. JGit's management of hashes is sh*t because
we have to bend so hard around the language.
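To give a flavor of it, the workaround looks roughly like this (a
minimal sketch of the trick only, not JGit's actual class; the names
here are made up). The SHA-1 is held as five ints instead of a
byte[20], so an equality test is five straight word compares:

  // Hold a SHA-1 as five ints rather than a byte[20]. Sketch only;
  // illustrative names, not JGit's real API.
  final class Sha1 {
    final int w1, w2, w3, w4, w5;

    Sha1(int w1, int w2, int w3, int w4, int w5) {
      this.w1 = w1; this.w2 = w2; this.w3 = w3;
      this.w4 = w4; this.w5 = w5;
    }

    // Convert from the usual 20-byte raw form.
    static Sha1 fromRaw(byte[] b) {
      return new Sha1(word(b, 0), word(b, 4), word(b, 8),
          word(b, 12), word(b, 16));
    }

    private static int word(byte[] b, int i) {
      return ((b[i] & 0xff) << 24) | ((b[i + 1] & 0xff) << 16)
          | ((b[i + 2] & 0xff) << 8) | (b[i + 3] & 0xff);
    }

    // The hand-written "hashcmp": 5 word compares. The class is
    // final, so the JIT is rather likely to inline this.
    boolean eq(Sha1 o) {
      return w1 == o.w1 && w2 == o.w2 && w3 == o.w3
          && w4 == o.w4 && w5 == o.w5;
    }

    // Any single word of a SHA-1 is already uniformly distributed,
    // so one of them can serve directly as the hash code.
    @Override
    public int hashCode() {
      return w2;
    }
  }

No loop, no array bounds checks, no byte-at-a-time compares; that is
about as close as Java lets us get to CGit's hashcmp().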
> Otherwise it is at least 12.21 + 10.49 + 7.47 + 2.71 = 32.88% spent
> directly in the zlib code, making it the biggest cost.

Yeah, that's what we have too, about 33% inside of zlib code... which
is the same implementation that CGit uses.

> This is rather unavoidable unless the data structure is changed.

We already knew this from our pack v4 experiments years ago.

> And pack v4 would probably move things such as find_pack_entry_one,
> decode_tree_entry, process_tree and tree_entry off the radar as
> well.

This is hard to do inside of CGit if I recall... but yes, changing the
way trees are handled would really improve things.

> The object writeout phase should pretty much be network bound.

Yes.

>> I fully implemented the reuse of a cached pack behind a thin pack
>> idea I was trying to describe in this thread. It saved 1m7s off the
>> JGit running time, but increased the data transfer by 25 MiB.
>
> Yeah... this sucks.

Very much. :-(

But this is a fundamental issue with our incremental fetch support
anyway. In this exact case, if the client was at that 1 month old
commit and fetched current master, he would pull 25 MiB of data...
but he only needed about 4-6 MiB worth of deltas if it was properly
delta compressed against the content we know he already has.

Our server-side optimization of only pushing the immediate "have"
list of the client into the delta search window limits how much we
can compress the data we are sending. If we were willing to push more
in on the server side, we could shrink the incremental fetch more.
But that's a CPU problem on the server.

--
Shawn.
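P.S. For the curious, the cached-pack reuse during writeout is mostly
stream splicing. Here's a rough sketch of the idea (a hypothetical
helper, not JGit's actual code; it assumes a version 2 pack and a
self-contained cached pack). A pack stream is a 12-byte header
("PACK", version, object count), the encoded objects, then a 20-byte
SHA-1 trailer over everything before it, so the server can emit the
freshly deltified thin part and then copy the cached pack's body
verbatim behind it:

  import java.io.*;
  import java.security.DigestOutputStream;
  import java.security.MessageDigest;
  import java.security.NoSuchAlgorithmException;

  final class CachedPackSplice {
    // thinObjects: already-encoded entries for the new objects,
    // deltified against the edge ("have") objects. cachedPack: a
    // pre-built pack file whose body is appended unchanged.
    static void write(OutputStream rawOut,
                      int thinObjectCount, byte[] thinObjects,
                      int cachedObjectCount, File cachedPack)
        throws IOException, NoSuchAlgorithmException {
      MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
      DigestOutputStream out = new DigestOutputStream(rawOut, sha1);
      DataOutputStream pack = new DataOutputStream(out);

      // Header carries the combined object count.
      pack.write(new byte[] { 'P', 'A', 'C', 'K' });
      pack.writeInt(2); // pack version
      pack.writeInt(thinObjectCount + cachedObjectCount);

      // The thin part: new objects, deltas allowed against bases
      // the client already has.
      pack.write(thinObjects);

      // The cached pack body, minus its own 12-byte header and
      // 20-byte trailer.
      long left = cachedPack.length() - 12 - 20;
      try (InputStream in = new FileInputStream(cachedPack)) {
        if (in.skip(12) != 12)
          throw new IOException("truncated cached pack");
        byte[] buf = new byte[8192];
        while (left > 0) {
          int n = in.read(buf, 0, (int) Math.min(buf.length, left));
          if (n < 0)
            throw new IOException("truncated cached pack");
          pack.write(buf, 0, n);
          left -= n;
        }
      }
      pack.flush();

      // Fresh SHA-1 trailer over the combined stream.
      rawOut.write(sha1.digest());
      rawOut.flush();
    }
  }

The CPU win comes from never inflating or delta searching anything
behind the splice point; the cost is the extra transfer noted above,
because nothing in the cached part can be thinned against what the
client already has.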