Jakub Narebski <jnareb@xxxxxxxxx> wrote:
> Let's rephrase question a bit then: what low-level operation were needed
> for good performance in JGit?

Aside from the message I just posted:

- Avoid String, it's too expensive most of the time.  Stick with
  byte[], and better, stick with data that is a triplet of (byte[],
  int start, int end) to define a region of data.  Yes, it's annoying,
  as it's 3 values you need to pass around instead of just 1, but
  it makes a big difference in running time.

- Avoid allocating byte[] for SHA-1s; instead we convert to 5 ints,
  which can be inlined into an object allocation.

- Subclass instead of containing references.  We extend ObjectId to
  attach application data, rather than contain a reference to an
  ObjectId.  Classical Java programming techniques would say this
  is a violation of encapsulation.  But it gets us the same memory
  impact that C Git gets by saying:

      struct appdata {
          unsigned char sha1[20];
          ....
      };

- We're hurting dearly for not having more efficient access to the
  pack-*.pack file data.  mmap in Java is crap.  We implement our
  own page buffer, reading in blocks of 8192 bytes at a time and
  holding them in our own cache.

  Really, we should write our own mmap library as an optional JNI
  thing, and tie it into libz so we can efficiently run inflate()
  off the pack data directly.

- We're hurting dearly for not having more efficient access to the
  pack-*.idx files.  Again, with no mmap we read the entire bloody
  index into memory.  Since you won't touch most of it we keep it
  in a large byte[], but because you are searching with an ObjectId
  (5 ints) we pay a conversion price on every search step: we have
  to copy from the large byte[] into 5 local int variables, and then
  compare those to the ObjectId.  It's an overhead C Git doesn't
  have to deal with.

Anyway.  I'm still just amazed at how well JGit runs given these
limitations.  I guess that's Moore's Law for you.  10 years ago,
JGit wouldn't have been practical.

-- 
Shawn.
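
For illustration, here is a minimal Java sketch of the first three points and of the per-step conversion cost in the index search. It is not JGit's actual code: the names Sha1Id, AppEntry, Regions, IndexSearch and Sha1IdDemo are hypothetical, and the "index" is assumed to be a bare sorted table of 20-byte records in one byte[], which is simpler than the real pack-*.idx layout.

```java
import java.nio.charset.StandardCharsets;

/** A SHA-1 stored as five ints rather than a byte[20]. Hypothetical class, not JGit's ObjectId. */
class Sha1Id {
    final int w1, w2, w3, w4, w5;

    Sha1Id(int w1, int w2, int w3, int w4, int w5) {
        this.w1 = w1; this.w2 = w2; this.w3 = w3; this.w4 = w4; this.w5 = w5;
    }

    /** Read one big-endian 32-bit word from buf at offset p. */
    static int int32(byte[] buf, int p) {
        return (buf[p] & 0xff) << 24 | (buf[p + 1] & 0xff) << 16
             | (buf[p + 2] & 0xff) << 8 | (buf[p + 3] & 0xff);
    }

    /** Decode the 20 raw bytes at buf[p .. p+20) into five ints. */
    static Sha1Id fromRaw(byte[] buf, int p) {
        return new Sha1Id(int32(buf, p), int32(buf, p + 4), int32(buf, p + 8),
                          int32(buf, p + 12), int32(buf, p + 16));
    }

    /** Compare against 20 raw bytes in place, converting bytes to ints on every call. */
    int compareTo(byte[] buf, int p) {
        int c = Integer.compareUnsigned(w1, int32(buf, p));
        if (c == 0) c = Integer.compareUnsigned(w2, int32(buf, p + 4));
        if (c == 0) c = Integer.compareUnsigned(w3, int32(buf, p + 8));
        if (c == 0) c = Integer.compareUnsigned(w4, int32(buf, p + 12));
        if (c == 0) c = Integer.compareUnsigned(w5, int32(buf, p + 16));
        return c;
    }
}

/** Application data attached by subclassing, so id and data share one allocation. */
class AppEntry extends Sha1Id {
    int flags; // hypothetical application field

    AppEntry(Sha1Id id, int flags) {
        super(id.w1, id.w2, id.w3, id.w4, id.w5);
        this.flags = flags;
    }
}

class Regions {
    /** Parse an unsigned decimal number from buf[start .. end) without creating a String. */
    static int parseInt(byte[] buf, int start, int end) {
        int v = 0;
        for (int i = start; i < end; i++)
            v = v * 10 + (buf[i] - '0');
        return v;
    }
}

class IndexSearch {
    /** Binary search a sorted table of 20-byte records; returns the record number or -1. */
    static int find(byte[] table, int count, Sha1Id id) {
        int lo = 0, hi = count;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            int c = id.compareTo(table, mid * 20);
            if (c == 0) return mid;
            if (c < 0) hi = mid; else lo = mid + 1;
        }
        return -1;
    }
}

class Sha1IdDemo {
    public static void main(String[] args) {
        byte[] table = new byte[60];          // three sorted 20-byte records
        table[19] = 1;                        // record 0: 00 .. 00 01
        table[20] = 0x7f;                     // record 1: 7f 00 .. 00
        table[40] = (byte) 0x80;              // record 2: 80 00 .. 00
        AppEntry e = new AppEntry(Sha1Id.fromRaw(table, 40), 7);
        System.out.println(IndexSearch.find(table, 3, e));           // prints 2
        byte[] hdr = "blob 1234".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Regions.parseInt(hdr, 5, hdr.length));    // prints 1234
    }
}
```

The comparison is unsigned (Integer.compareUnsigned) so that the int ordering matches the byte-wise ordering of the raw SHA-1s; a signed compare would sort 0x80... before 0x7f.... Every step of find() re-reads 20 bytes and rebuilds five ints, which is the conversion price mentioned in the last bullet.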